Sunday, July 26, 2009

Googling the Right Way (a repost)

Notes

This is a republication of an article I wrote in early 2008, touching lightly on Google queries and the mighty search giant's history. I first published the text on my then pet project site, pworks (which, by the way, now redirects to tenzui), and from there it was mirrored on a couple of technology-focused sites and forums.

Since I wrote this, both I and the world of search have progressed greatly, and I hope to write a more in-depth follow-up post on searching in the future. So if you're interested in that - stay connected. ;)

Short History

Right, this will be a really, really short history lesson. If you're interested, check out for yourself what the people over there have written. (Link at page bottom)

So, Google was created by the duo Larry Page and Sergey Brin, two Stanford grad students who, although they didn't see eye-to-eye on many topics, were determined to crack the quite boring nut of organizing all the information that was spread out on the web. By 1997, their BackRub search engine had started gaining a sparkling reputation for its unique way of analyzing and ranking webpages through "back links", links pointing to a page from other pages. The system also gained attention for its interesting server environment: contrary to the "normal" high-end servers, BackRub ran on a collection of simpler PCs, gathered from the campus's nooks and crannies.

From there, the story is one of unfathomable success ("Instead of discussing all the details, why don't I just write you a check?"), leading to the status of The One Search-engine we all know, love and envy.

PageRank

"Back links?" you think. Yeah, Google's way of deciding which pages are worth your reading-cycles differed from that of all other search engines at the time. The PageRank algorithm gives every site a rank between 0 and 10, based on how many other pages link to the site and what value the linking pages have.
If you are interested in the mathematics behind the PageRank algorithm, I suggest you read about it on Wikipedia. The logic behind PR is outside the scope of this article.

From this information, you can probably figure out the basics of SEO, Search Engine Optimization: get your page linked to by the big boys. Of course, some people just can't be content with playing by the rules, and the PR algorithm isn't perfect, so from time to time someone manages to fool it. One example is the 302 Google Jack, which redirects a new, zero-ranked page to a rank-ten page, like Google itself. When Google updates the PageRanks, the new page gets the same rank as the page it redirects to. Other people buy and sell high-valued links, really a kind of advertising, but with a big debate buzzing in the background. Google has requested that such links use the HTML attribute "nofollow", which causes the link to be ignored when pages are re-ranked.

The above-mentioned tricks, as well as many others, can of course get your page devalued, meaning that it will not be ranked at all. Play safe!

Basics

Every Joe Schmoe knows that search engines like Google don't take kindly to long sentences and free text, but he probably never bothered reading up on how the magical search box actually works, something he should be severely punished for. Let's leave Joe to his fate, and rise far above him, to the lands without stupid questions.
Even in the "basic" syntax collection I'm sure you are able to find a few sparkling gems you didn't know about, so skim through it even if you feel confident in your Google-Fu.

So, top down: a standalone word yields pages containing that word, and a sentence enclosed in quotation marks (" ") similarly yields pages that contain that exact phrase. If you have ever written an SQL query for some database, I'm sure you will find a lot of similarities as we go on. Google is actually "just a database", remember?

Command              Example                             Result

AND [&] (ampersand)  Slackware AND Linux                 Shows pages containing both arguments. Note: this is the default operator, so there is no need to include it
OR [|] (pipe)        laptop OR Desktop                   Shows pages containing either argument
- (minus)            Hamburger -McDonalds                Shows pages containing the word "Hamburger", but only if they don't mention "McDonalds"
+ (plus)             +coke                               Contrary to the "includes" belief, this limits the results to the given form only, no plurals or other tenses
~ (tilde)            ~Hacker                             Results include everything deemed similar to "Hacker"
* (asterisk)         Fish * Chips                        The wildcard (*) is replaced by one or more words/characters (and, n, 'n, &)
define:              define:Nocturnal                    A personal favorite, looks up the meaning of the word
site:                Phreaking site:phrack.org           Limits the search to a specific site
#..#                 zeroday 2007..2008                  Search results include a value within the given range
info:                info:www.hacktivismo.com            Shows information about the site
related:             related:www.google.com              Shows pages similar/related to the argument
link:                link:www.darkmindz.com              Shows sites linking to the argument
filetype:            phrack filetype:pdf                 Results are limited to the given filetype
([?])                Cyber (China & America)             Nesting combines several terms in the same query
[?A] in [?B]         1 dollar in yen                     Converts argument A to argument B
daterange:           daterange:2452122-2452234           Results are within the specified date range. Dates are given as Julian day numbers
movie:               movie:Hackers                       Movie reviews, can also find movie theaters running the movie in U.S. cities
music:               music:"Weird Al"                    Hits relate to music
stock:               stock: goog                         Returns stock information (NYSE, NASDAQ, AMEX)
time:                time: Stockholm                     Shows the current time in the requested city
safesearch:          safesearch: teen                    Excludes pornography
allinanchor:         allinanchor: "Best webcomic ever"   Results are pages that others link to using the argument as link text
inanchor:            foo bar inanchor:jargon             As above, but applies only to the term directly after the operator, not to all terms. The corresponding pairs below work the same way
allintext:           allintext:8-bit music               Argument exists in text
intext:
allintitle:          allintitle: Portfolio               Argument exists in title
intitle:
allinurl:            allinurl:albino sheep               Argument exists in URL
inurl:
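
The daterange: operator in the table above expects Julian day numbers rather than ordinary calendar dates, which makes it awkward to type by hand. Here is a minimal Python sketch of the conversion. The helper names (to_julian_day, from_julian_day, daterange) are my own, made up for illustration, and the offset assumes whole-day Julian day numbers counted in the Gregorian calendar, as Python's datetime module does.

```python
from datetime import date

# Difference between Python's proleptic Gregorian ordinal (date.toordinal)
# and the whole-day Julian day number. Assumption: whole days only.
JDN_OFFSET = 1721425


def to_julian_day(d: date) -> int:
    """Convert a calendar date to its Julian day number."""
    return d.toordinal() + JDN_OFFSET


def from_julian_day(jdn: int) -> date:
    """Convert a Julian day number back to a calendar date."""
    return date.fromordinal(jdn - JDN_OFFSET)


def daterange(start: date, end: date) -> str:
    """Build a daterange: term (hypothetical helper) for two calendar dates."""
    return f"daterange:{to_julian_day(start)}-{to_julian_day(end)}"


if __name__ == "__main__":
    # The example from the table, decoded and re-encoded.
    print(from_julian_day(2452122), from_julian_day(2452234))
    print(daterange(date(2001, 7, 31), date(2001, 11, 20)))
```

Decoded this way, the table's example range 2452122-2452234 covers July 31 to November 20, 2001.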

Advanced
GET-variable breakdown
http://www.google.com/search?
as_q=test (query string)
&hl=en (language)
&num=10 (number of results [ 10,20,30,50,100 ])
&btnG=Google+Search
&as_epq= (complete phrase)
&as_oq= (at least one)
&as_eq= (excluding)
&lr= (language results. [ lang_countrycode ])
&as_ft=i (filetype include or exclude. [i,e])
&as_filetype= (filetype extension)
&as_qdr=all (date [ all,m3,m6,y ])
&as_nlo= (number range, low)
&as_nhi= (number range, high)
&as_occt=any (terms occur [ any,title,body,url,links ])
&as_dt=i (restrict by domain [ i,e ])
&as_sitesearch= (restrict by [ site ])
&as_rights= (usage rights [ cc_publicdomain, cc_attribute, cc_sharealike, cc_noncommercial, cc_nonderived ])
&safe=images (safesearch [ safe=on,images=off ])
&as_rq= (similar pages)
&as_lq= (pages that link)
&as_qdr= (get only recently updated pages d[ i ] | w[ i ] | y[ i ])
&gl=us (country)
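
To see how these GET variables fit together, here is a small Python sketch that assembles a search URL from a handful of them. The function name build_search_url and its keyword arguments are made up for illustration, only a few of the parameters listed above are covered, and whether Google still honors each of them is an assumption on my part.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.google.com/search?"


def build_search_url(query, exact_phrase="", site="", filetype="",
                     num=10, lang="en"):
    """Assemble a search URL from some of the GET variables listed above."""
    # Always-present parameters: query string, interface language, result count.
    params = {"as_q": query, "hl": lang, "num": num}
    # Optional parameters are only added when a value was given.
    optional = {
        "as_epq": exact_phrase,    # complete phrase
        "as_sitesearch": site,     # restrict to a site
        "as_filetype": filetype,   # filetype extension
    }
    params.update({key: value for key, value in optional.items() if value})
    # urlencode escapes the values (spaces become "+") and joins them with "&".
    return BASE_URL + urlencode(params)


if __name__ == "__main__":
    print(build_search_url("phrack", site="phrack.org", filetype="pdf"))
```

Leaving the empty parameters out keeps the URL short; the breakdown above shows that Google's own form sends them along empty instead.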

Googledorks

So, Google gives us all those handy tools for filtering away what we don't want to see. How can we use this to help secure our own systems?

Well, for example, we could use the neat Google Hacking Database, a project where people have submitted a huge collection of queries yielding results that the unskilled webmaster (the Googledork) wishes weren't there. Everything from vulnerable login forms to passwords surfaces with some cleverly engineered queries.

Goolag

Goolag is a vulnerability scanner (and a politically involved protest) made by the famous Cult of the Dead Cow. It builds on the above-mentioned GHDB, scanning for the vulnerabilities in the database. At the moment there is only a Windows version of the program. The Goolag project is also a campaign against the choice by Google (and a few other big players) to comply with the Chinese censorship policy.

Useful Queries

mp3 file indexes (add the desired artist):
-inurl:htm -inurl:html intitle:"index of" "Last modified" mp3

Zip files on rapidshare uploaded within the specified date range:
site:rapidshare.de -filetype:zip OR rar daterange:2453402-2453412

Query results updated within one day:
http://www.google.com/search?q=your+query+here&as_qdr=d1
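
Since the mp3 index query is meant to be extended with an artist, here is a tiny Python sketch that glues the pieces together. The function name mp3_index_query is hypothetical, and putting the artist in quotation marks to force an exact phrase is my own choice, not something the query requires.

```python
def mp3_index_query(artist: str) -> str:
    """Compose the mp3 index query from this section for a given artist."""
    terms = [
        "-inurl:htm",           # skip ordinary .htm pages
        "-inurl:html",          # skip ordinary .html pages
        'intitle:"index of"',   # directory listings carry this title
        '"Last modified"',      # column header found in such listings
        "mp3",
        f'"{artist}"',          # assumption: match the artist as an exact phrase
    ]
    return " ".join(terms)


if __name__ == "__main__":
    print(mp3_index_query("Weird Al"))
```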

Others

http://www.google.com/search?q=answer+to+life,+the+universe,+and+everything
http://www.churchofgoogle.org
http://www.google.com/technology/pigeonrank.html

References
http://www.google.com/help/cheatsheet.html
http://www.dumblittleman.com/2007/06/20-tips-for-more-efficient-google.html
http://www.googleguide.com/advanced_operators_reference.html
http://sudarmuthu.com/blog/2006/05/07/google-search-syntax-dissected.html
http://en.wikipedia.org/wiki/PageRank
http://johnny.ihackstuff.com/
