SEARCHING AND FINDING INFORMATION ONLINE (ch 6)
- name some search engines
- query: the question you want answered, or the topic you search on
  - sample queries
- types of queries (the book has 3 types)
  - open-ended exploratory question (voyager): curious about something & want to see what's out there
  - deep-thought question: also open-ended but more focused (name comes from the meaning of life/42)
  - joe friday question (just the facts)
- the type of query determines what kind of internet resource you should use to answer it
- queries can change as you're searching (e.g., joe friday queries can become deep-thought questions)
- subject tree: hierarchically organized directory of websites
- clearinghouse: collection of websites or online documents about a specific topic
  - if it's a broad topic, it might be organized hierarchically, but a clearinghouse is still always more specific than a subject tree
- general search engine: indexes a large # of webpages via keywords
  - relies on automatic methods (spiders) to "crawl" the web in search of pages to add to the index
  - not restricted in topic
- specialized search engine: like a general search engine except limited to a specific topic; relies on people to handpick documents
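A rough sketch of the spider + keyword index idea from the general search engine definition above. Everything here is made up for illustration: the toy in-memory "web", the .example addresses, and the function name. Real spiders fetch pages over the network and real indexes are far more elaborate.

```python
from collections import defaultdict, deque

# Hypothetical in-memory "web": address -> (page text, outgoing links).
TOY_WEB = {
    "a.example": ("zebras live in africa", ["b.example", "c.example"]),
    "b.example": ("zebra stripes and horses", ["a.example"]),
    "c.example": ("horses eat grass", []),
    "d.example": ("unlinked page about zebras", []),
}


def crawl_and_index(start_url, web):
    """Follow links breadth-first from start_url and map each keyword
    to the set of pages that contain it."""
    index = defaultdict(set)
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        text, links = web[url]
        for word in text.split():
            index[word].add(url)
        for link in links:
            # Only follow links to pages that exist and are not yet visited.
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    return index


index = crawl_and_index("a.example", TOY_WEB)
print(sorted(index["horses"]))  # ['b.example', 'c.example']
print(sorted(index["zebras"]))  # ['a.example']
```

Note that d.example never gets indexed because no crawled page links to it, and that "zebra" and "zebras" are separate keywords in this sketch.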

subject trees/clearinghouses
- really browsing aids; they require exploration
- ex: yahoo, about.com, open directory project (dmoz.org)
- organize documents in a hierarchy of categories, although sometimes these categories aren't intuitive
- larger subject trees are equipped with search engines to search the categories and sites for you: use a site search (restricted to that main site)
  - so subject trees can have search engines
  - but search engines that have subject trees are different: they search the whole web and might return documents not in their tree
- category matches and site matches: categories are more important
  - browse through sites in each category: someone has already figured out what the best sites are for each category
- about.com: each category in the subject tree is maintained by a human expert
  - not as many pages as yahoo, but sites are hand-reviewed and policed for content, reliability, and accuracy
- open directory project
  - started in 1998 (originally as newhoo, then acquired by netscape) to compete with yahoo
  - emphasizes practical know-how rather than academic expertise
  - uses volunteers who act as editors in specific categories
- CLEARINGHOUSES: large collection of resources or documents on a specific topic
  - some are maintained by researchers, some by the government, some by private commercial organizations that charge $$, some compiled by librarians or teachers
  - ex: environmental law net, infosyssec (computer security), webmd
  - find clearinghouses using a clearinghouse index (ipl.org)
  - very useful if you spend much of your time looking up information on one topic

general search engines/meta search engines
- the most common ones you've heard of; they use keywords
- know how to use your search engine: can use fancy queries to return pages that use all words, some words, wildcards, synonyms
- don't look beyond the first 20-30 hits
- experiment with different keywords: successive query refinement
  - examine hits that come back, add or remove keywords to broaden or narrow
- meta search engines search many search engines at once, and return lists of hits from each one
  - they are careful not to swamp you with hits or duplicates of top sites
  - some will cluster the hits together into categories, which you then can filter

query refinement
- exact phrase matching: use quotes
  - might be case sensitive
  - especially useful with proper names, although have to watch out for middle initials, nicknames, etc.
  - useful for a specific work that must contain a certain phrase, such as a work of shakespeare or song lyrics
- title search: only searches keywords in the title of the page
- sometimes have to experiment (zebras vs zebra)
- wildcard matching: *, ?, # (careful with short roots)
- boolean queries: AND, OR, NOT, NEAR
  - grouping matters: (A & B) | C vs A & (B | C)
- required and prohibited keywords (+, -)

specialized search engines
- find them using search engines for search engines (searchengineguide.com)
- ex: find news (news.google.com), pictures (images.google.com), local information, products

assessing credibility on the web
- anyone can put up anything they want on the web, true or false
- evaluate the credibility of a site by its content, not its look and feel
- a page is useless for research if it doesn't identify its author
  - look for a link to the author's homepage or for a short biography
  - check for a .gov or .edu address, but students have .edu addresses too
  - verify the author is who they say they are (check main site)
- accurate writing and documentation
  - spelling, grammar, where they got their info from
- objectivity: is it a commercial company?
- stability of webpages: they change all the time, can disappear tomorrow
  - does it include a date of update? last revised?
- fraudulent webpages (e.g., political candidates)
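
Going back to the query refinement notes: a small sketch of why grouping matters in boolean queries, plus required/prohibited keywords and wildcards. This assumes a made-up keyword index (keyword -> set of page ids) and uses Python set operations and shell-style patterns from the standard fnmatch module as stand-ins. Actual operator and wildcard syntax (including #) varies by search engine.

```python
from fnmatch import fnmatchcase

# Hypothetical index: keyword -> ids of the pages that contain it.
INDEX = {
    "jaguar": {1, 2, 3},
    "car": {2, 4},
    "cat": {3, 5},
}
A, B, C = INDEX["jaguar"], INDEX["car"], INDEX["cat"]

# Boolean grouping: AND is set intersection (&), OR is set union (|).
left = (A & B) | C   # (jaguar AND car) OR cat
right = A & (B | C)  # jaguar AND (car OR cat)
print(sorted(left))   # [2, 3, 5]
print(sorted(right))  # [2, 3]

# Required and prohibited keywords: +jaguar -cat keeps jaguar pages
# and drops any that mention cat.
print(sorted(A - C))  # [1, 2]

# Wildcards: a short root matches far more than intended.
WORDS = ["zebra", "zebras", "zen", "zero"]
print([w for w in WORDS if fnmatchcase(w, "zebra*")])  # ['zebra', 'zebras']
print([w for w in WORDS if fnmatchcase(w, "ze*")])     # all four words
```

The two groupings return different page sets from the same three keywords, which is why parentheses are worth adding whenever AND and OR are mixed.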