Nonprofit Common Crawl Offers a Database of the Entire Web, for Free, and Could Open Up Google to New Competition - LGF Pages

Internet • January 2013 • Views: 1,284

Common Crawl supplies a database of over five billion Web pages in the hope that it will inspire new research or online services.

Why It Matters

A freely available copy of billions of Web pages could create competition for established giants such as Google.

Google famously started out as little more than a more efficient algorithm for ranking Web pages. But the company also built its success on crawling the Web—using software that visits every page in order to build up a vast index of online content.
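As a rough illustration of the crawl-then-index idea described above, here is a minimal sketch in Python. It is not Google's or Common Crawl's actual software: the three-page `TOY_WEB` dictionary, the `example` URLs, and the function names `crawl` and `build_index` are all hypothetical, and the "fetch" step reads from memory rather than from the network.

```python
from collections import defaultdict, deque

# Hypothetical toy "web": URL -> (page text, outgoing links). Not real data.
TOY_WEB = {
    "a.example/": ("open web crawl data", ["a.example/about", "b.example/"]),
    "a.example/about": ("about the crawl project", ["a.example/"]),
    "b.example/": ("search engines index web pages", []),
}


def crawl(seed, fetch):
    """Breadth-first crawl from `seed`; returns {url: text} for reachable pages."""
    seen, queue, pages = {seed}, deque([seed]), {}
    while queue:
        url = queue.popleft()
        result = fetch(url)  # a real crawler would issue an HTTP request here
        if result is None:   # unknown or unreachable page: skip it
            continue
        text, links = result
        pages[url] = text
        for link in links:
            if link not in seen:  # visit each page only once
                seen.add(link)
                queue.append(link)
    return pages


def build_index(pages):
    """Inverted index: word -> set of URLs whose text contains that word."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


pages = crawl("a.example/", TOY_WEB.get)
index = build_index(pages)
print(sorted(index["crawl"]))
```

Looking up a word in `index` returns the pages that mention it, which is the starting point a ranking algorithm would then order. A production crawler would also need politeness rules, deduplication, and storage for billions of pages.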

A nonprofit called Common Crawl now runs its own Web crawler and makes the resulting giant copy of the Web accessible to anyone. The organization offers more than five billion Web pages for free, so that researchers and entrepreneurs can try things otherwise possible only for those with access to resources on the scale of Google’s.

“The Web represents, as far as I know, the largest accumulation of knowledge, and there’s so much you can build on top,” says entrepreneur Gilad Elbaz, who founded Common Crawl. “But simply doing the huge amount of work that’s necessary to get at all that information is a large blocker; few organizations … have had the resources to do that.”

New search engines are just one of the things that can be built using an index of the Web, says Elbaz, who points out that Google’s translation software was trained using online text available in multiple languages. “The only way they could do that was by starting with a massive crawl. That’s put them on the way to build the Star Trek translator,” he says. “Having an open, shared corpus of human knowledge is simply a way of democratizing access to information that’s fundamental to innovation.”


