1
Database Searching
LS 574
  • Summer Session II
  • Clarion University of Pennsylvania
  • July – August 2005
2
What Are Databases Made Of?
  • Databases or files
    • Subfiles
      • Records
        • Fields
          • Subfields
3
A Sample Database Record
  • Personal author: Garreau, Joel.
  • Title: The nine nations of North America / Joel Garreau.
  • Publication info: Boston : Houghton Mifflin, 1981.
  • ISBN: 0395291240 : $12.95
  • Physical description: xvii, 427 p. : ill. ; 24 cm.
  • General note: Includes index.
  • Bibliography note: Bibliography: p. [397]-413.
4
A Sample Database Record (cont.)
  • Personal subject: Garreau, Joel.
  • Subject: United States--Description and travel--1960-
  • Subject: United States--Economic conditions--1961-
  • Subject: United States--Social conditions--1960-
  • Subject: Canada--Description and travel--1945-
  • Subject: Canada--Economic conditions--1945-
  • Subject: Canada--Social condition.


5
The Record’s Underlying Structure
  • 100:  10 : Garreau, Joel.
  • 245:  14 : The nine nations of North America /|cJoel Garreau.
  • 260:  0  : Boston :|bHoughton Mifflin,|c1981.
  • 020:     : 0395291240 :|c$12.95
  • 300:     : xvii, 427 p. :|bill. ;|c24 cm.
  • 500:     : Includes index.
  • 504:     : Bibliography: p. [397]-413.
  • 596:     : 5
6
The Record’s Underlying Structure (cont.)
  • 600:  10 : Garreau, Joel.
  • 651:   0 : United States|xDescription and travel|y1960-
  • 651:   0 : United States|xEconomic conditions|y1961-
  • 651:   0 : United States|xSocial conditions|y1960-
  • 651:   0 : Canada|xDescription and travel|y1945-
  • 651:   0 : Canada|xEconomic conditions|y1945-
  • 651:   0 : Canada|xSocial condition.
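The tagged display above can be taken apart mechanically. The following is a minimal sketch, assuming the `tag: indicators : data` layout and the `|` subfield delimiter exactly as they appear on these slides. `parse_marc_line` is a hypothetical helper, and treating the unlabelled leading text as subfield `a` is an assumption.

```python
def parse_marc_line(line):
    """Split one display line into its tag, indicators, and subfields."""
    # The first two colons separate tag, indicators, and data;
    # any later colons belong to the data itself.
    tag, indicators, data = line.split(":", 2)
    pieces = data.strip().split("|")
    # Assumption: text before the first "|" is subfield a.
    subfields = [("a", pieces[0].strip())]
    for piece in pieces[1:]:
        # The character right after "|" is the subfield code.
        subfields.append((piece[0], piece[1:].strip()))
    return {
        "tag": tag.strip(),
        # Stripping loses which indicator position was blank.
        "indicators": indicators.strip(),
        "subfields": subfields,
    }


record = parse_marc_line("260:  0  : Boston :|bHoughton Mifflin,|c1981.")
for code, value in record["subfields"]:
    print(record["tag"], code, value)
```

Each field becomes a list of (subfield code, value) pairs, which mirrors the fields-and-subfields hierarchy from the second slide.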



7
Matt’s Presidential Literature Database
  • RN - 101
  • AU - Mazlish, Bruce
  • TI - In Search of Nixon : A Psychohistorical Inquiry
  • SU - Nixon, Richard M. (Richard Milhous), 1913-1994


8
Matt’s Presidential Literature Database (cont.)
  • RN - 102
  • AU - Eisenhower, Dwight D.
  • TI - At Ease : Stories I Tell My Friends
  • SU - Eisenhower, Dwight D. (Dwight David), 1890-1969
9
Matt’s Presidential Literature Database (cont.)
  • RN - 103
  • AU - Ellis, Joseph J.
  • TI - Passionate Sage : The Character and Legacy of John Adams
  • SU - Adams, John, 1743-1826


10
Basic vs Additional Index
  • In our toy database, the title and subject fields make up the basic index
  • The author field will be our additional index
11
Dialog’s Stop Words
  • an
  • and
  • by
  • for
  • from
  • of
  • the
  • to
  • with


12
Number all Words and/or Phrases
  • Mazlish, Bruce 101 AU 1
  • in 101 TI 1
  • search 101 TI 2
  • nixon 101 TI 3
  • a 101 TI 4
  • psychohistorical 101 TI 5
  • inquiry 101 TI 6
13
Number all Words and/or Phrases (cont.)
  • nixon 101 SU 1
  • richard 101 SU 2
  • m 101 SU 3
  • richard 101 SU 4
  • milhous 101 SU 5
  • 1913 101 SU 6


14
Number all Words and/or Phrases (cont.)
  • 1994 101 SU 7
  • Nixon, Richard M. (Richard Milhous), 1913-1994 101 SU 8


  • Eisenhower, Dwight D. 102 AU 1
  • at 102 TI 1
  • ease 102 TI 2
15
Number all Words and/or Phrases (cont.)
  • stories 102 TI 3
  • i 102 TI 4
  • tell 102 TI 5
  • my 102 TI 6
  • friends 102 TI 7
  • eisenhower 102 SU 1
  • dwight 102 SU 2
16
Number all Words and/or Phrases (cont.)
  • … And so on …


  • Next step → Alphabetize the list for the basic index
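The numbering step can be sketched in a few lines of Python. This is an illustration under assumptions read off the slides, not Dialog's actual indexing code: words are lowercased, stop words receive neither an entry nor a position, and a phrase-indexed field adds the whole field as one final entry. `number_field` and its parameters are hypothetical names.

```python
import re

# Dialog's stop words as listed on the earlier slide.
STOP_WORDS = {"an", "and", "by", "for", "from", "of", "the", "to", "with"}


def number_field(record_number, field, text, words=True, phrase=False):
    """Return (term, record number, field, position) entries for one field."""
    entries = []
    position = 0
    if words:
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            if word in STOP_WORDS:
                continue  # assumption from the slides: stop words take no position
            position += 1
            entries.append((word, record_number, field, position))
    if phrase:
        # The whole field is stored as one phrase after the last word.
        position += 1
        entries.append((text, record_number, field, position))
    return entries


entries = (
    number_field(103, "AU", "Ellis, Joseph J.", words=False, phrase=True)
    + number_field(103, "TI", "Passionate Sage : The Character and Legacy of John Adams")
    + number_field(103, "SU", "Adams, John, 1743-1826", phrase=True)
)

for entry in entries:
    print(*entry)
```

Record 103 is used here because its title contains three stop words ("the", "and", "of"), so the skipped positions are easy to see.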
17
Alphabetize the List

  • 1743 103 SU 3
  • 1826 103 SU 4
  • 1890 102 SU 6
  • 1913 101 SU 6
  • 1969 102 SU 7
  • 1994 101 SU 7
18
Alphabetize the List (cont.)
  • a 101 TI 4
  • adams 103 TI 6
  • adams 103 SU 1
  • Adams, John, 1743-1826 103 SU 5
  • at 102 TI 1
  • character 103 TI 3
  • d 102 SU 3
19
Alphabetize the List (cont.)
  • david 102 SU 5
  • dwight 102 SU 2
  • dwight 102 SU 4
  • ease 102 TI 2
  • eisenhower 102 SU 1
  • Eisenhower, Dwight D. (Dwight David), 1890-1969 102 SU 8


  • … and so on
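Alphabetizing is a sort on the term alone. The sketch below assumes a case-insensitive comparison and relies on Python's stable sort to keep entries with the same term in the order they were numbered; both choices are assumptions that happen to reproduce the order on these slides. The entries are a subset copied from the numbered list.

```python
# A subset of the numbered entries, in the order they were produced.
entries = [
    ("a", 101, "TI", 4),
    ("1913", 101, "SU", 6),
    ("1994", 101, "SU", 7),
    ("at", 102, "TI", 1),
    ("ease", 102, "TI", 2),
    ("eisenhower", 102, "SU", 1),
    ("dwight", 102, "SU", 2),
    ("d", 102, "SU", 3),
    ("dwight", 102, "SU", 4),
    ("david", 102, "SU", 5),
    ("1890", 102, "SU", 6),
    ("1969", 102, "SU", 7),
    ("Eisenhower, Dwight D. (Dwight David), 1890-1969", 102, "SU", 8),
    ("character", 103, "TI", 3),
    ("adams", 103, "TI", 6),
    ("adams", 103, "SU", 1),
    ("1743", 103, "SU", 3),
    ("1826", 103, "SU", 4),
    ("Adams, John, 1743-1826", 103, "SU", 5),
]

# Digits sort before letters, and a word sorts before a phrase that starts with it.
basic_index = sorted(entries, key=lambda entry: entry[0].lower())

for entry in basic_index:
    print(*entry)
```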



20
Author Additional Index
  • Eisenhower, Dwight D. 102 AU 1
  • Ellis, Joseph J. 103 AU 1
  • Mazlish, Bruce 101 AU 1


  • Note that this index is phrase indexed but NOT word indexed
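The practical effect of phrase indexing can be shown with a toy lookup. This sketch assumes an exact-match lookup on the full author phrase; `author_index` and `select_author` are hypothetical names, and a real system would normalize case and punctuation.

```python
# Phrase-indexed author index: each key is the whole author field.
author_index = {
    "Eisenhower, Dwight D.": [102],
    "Ellis, Joseph J.": [103],
    "Mazlish, Bruce": [101],
}


def select_author(phrase):
    """Return record numbers whose author field equals the phrase exactly."""
    return author_index.get(phrase, [])


print(select_author("Mazlish, Bruce"))  # the full phrase is an index entry
print(select_author("Mazlish"))         # a single word is not, so nothing matches
```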
21
A Record from Dialog
  • Let’s look beyond the book record of a catalog:
  • http://library.dialog.com/bluesheets/html/bl0001.html
22
Does the Internet have Structure?
  • http://www.contrib.andrew.cmu.edu/~matthewm/mrmwork.html
23
Database Types
  • By Format:  Online, CD-ROM, Tape
  • By Structure
    • Bibliographic
    • Directory
    • Fulltext
    • Citation – SciSearch http://library.dialog.com/bluesheets/html/bl0034.html
    • Financial – Duns Financial Records Plus http://library.dialog.com/bluesheets/html/bl0519.html
    • Numeric – Metals Data File http://www.cas.org/ONLINE/DBSS/mdfss.html
24
Starting a Search in ERIC on Dialog
  • ?BEGIN 1
  • or, in an abbreviated format:
  • ?b 1
25
"?B"
  • ?B 1
  •        30may03 22:38:44 User556323 Session D1.1
  •             $0.00    0.241 DialUnits FileHomeBase
  •      $0.00  Estimated cost FileHomeBase
  •      $0.04  INTERNET
  •      $0.04  Estimated cost this search
  •      $0.04  Estimated total session cost   0.241 DialUnits
  •  File   1:ERIC  1966-2003/May 10
  •        (c) format only 2003 The Dialog Corporation
  •       Set  Items  Description
  •       ---  -----  -----------
  • ?
26
Searching for a Word
  • ?SELECT mathematics


  • or, in abbreviated format:


  • ?s mathematics


28
Using Boolean (Logical) Operators
  • OR Operator Use OR to group synonymous terms
  • when at least one must be present.



  • AND Operator Use AND to connect terms when both
  • or all  must be present.



  • NOT Operator Use NOT to exclude records containing
  • a specified term.
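The three operators correspond to set union, intersection, and difference. A small sketch with made-up record numbers (the sets below are illustrative, not ERIC data):

```python
# Each set holds the record numbers that contain the term.
fear = {1, 2, 3, 4}
mathematics = {3, 4, 5, 6}
math = {4, 6, 7}

either = math | mathematics      # OR: at least one term is present
both = fear & either             # AND: all terms are present
excluded = mathematics - fear    # NOT: drop records containing "fear"

print(sorted(either))    # [3, 4, 5, 6, 7]
print(sorted(both))      # [3, 4]
print(sorted(excluded))  # [5, 6]
```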


29
Venn Diagram Depicting AND Logic
30
Venn Diagram Depicting OR Logic
32
Using OR Operator; Display Sets (DS) Command
  • ?S MATH OR MATHEMATICS
  •            10825  MATH
  •            54890  MATHEMATICS
  •       S5   58984  MATH OR MATHEMATICS
  • ?DS
  • Set     Items   Description
  • S1      54890   MATHEMATICS
  • S2       4055   FEAR
  • S3        116   S1 AND S2
  • S4     169200   1 AND 2
  • S5      58984   MATH OR MATHEMATICS
  • ?
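The OR result (58,984) is smaller than the sum of the two postings because a record containing both words is counted only once. The overlap follows from the counts in the transcript above:

```python
math_count = 10825         # records containing MATH
mathematics_count = 54890  # records containing MATHEMATICS
or_count = 58984           # records in MATH OR MATHEMATICS

# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|
both_count = math_count + mathematics_count - or_count
print(both_count)  # 6731 records contain both words
```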
33
"?S"
  • ?S S2 AND S5
  •             4055  S2
  •            58984  S5
  •       S6     138  S2 AND S5
  • ?DS
  • Set     Items   Description
  • S1      54890   MATHEMATICS
  • S2       4055   FEAR
  • S3        116   S1 AND S2
  • S4     169200   1 AND 2
  • S5      58984   MATH OR MATHEMATICS
  • S6        138   S2 AND S5
  • ?
34
“/ENG” – Limit by Language
  • ? S S3 OR S6
  •              116  S3
  •              138  S6
  •       S7     138  S3 OR S6
  • ?S S7/ENG
  • >>>Term "ENG" is not defined in file 1 and is ignored
  •       S8     138  S7/ENG
  • ?
  • http://library.dialog.com/bluesheets/html/bl0001.html


35
Suffix Searching
  • ?s mathematics searches for the word ‘mathematics’ in the basic index (usually includes the abstract field and may include fulltext!)


  • ?s mathematics/ti searches for the word ‘mathematics’ in the title field


  • ?s mathematics/ti,de searches for the word ‘mathematics’ in the title OR descriptor fields
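A toy model of suffix searching: the suffix narrows the search to the named fields, and no suffix means every basic index field. The postings, the field codes in `BASIC_INDEX_FIELDS`, and the `select` helper are all illustrative assumptions; the real field list for a file comes from its Bluesheet.

```python
# Assumption: this toy file's basic index covers title, descriptor, and abstract.
BASIC_INDEX_FIELDS = {"ti", "de", "ab"}

# term -> list of (record number, field) postings; made-up data.
postings = {
    "mathematics": [(1, "ti"), (2, "ab"), (3, "de"), (4, "ti")],
}


def select(query):
    """Return record numbers matching 'term' or 'term/field,field'."""
    term, _, suffix = query.partition("/")
    fields = set(suffix.lower().split(",")) if suffix else BASIC_INDEX_FIELDS
    return sorted({record for record, field in postings.get(term.lower(), [])
                   if field in fields})


print(select("mathematics"))        # [1, 2, 3, 4]
print(select("mathematics/ti"))     # [1, 4]
print(select("mathematics/ti,de"))  # [1, 3, 4]
```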
36
EXPAND Command in the Language Index (LA=)
  • ? E LA=ENGLISH
  • Ref   Items  Index-term
  • E1       19  LA=DUTCH
  • E2        1  LA=EDO
  • E3   752078 *LA=ENGLISH
  • E4        1  LA=ESPERANTO
  • .
  • .
  • .
  • E8       14  LA=FINNISH
  • E9     3292  LA=FRENCH
  • E10       3  LA=FULANI
  • E11       1  LA=GANDA
  • E12     728  LA=GERMAN
  •           Enter P or PAGE for more
  • ?
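EXPAND amounts to opening the alphabetized index at a term and listing its neighbors. A minimal sketch using only the first four entries shown above; the `expand` helper and its window sizes are assumptions, not Dialog's implementation.

```python
import bisect

# (index term, item count), already in alphabetical order.
index_terms = [
    ("LA=DUTCH", 19),
    ("LA=EDO", 1),
    ("LA=ENGLISH", 752078),
    ("LA=ESPERANTO", 1),
]


def expand(term, before=2, after=1):
    """List the index entries around a term, marking an exact match with '*'."""
    keys = [key for key, _ in index_terms]
    position = bisect.bisect_left(keys, term)  # where the term sits or would sit
    start = max(0, position - before)
    window = index_terms[start:position + after + 1]
    return [
        (f"E{number}", count, ("*" if key == term else " ") + key)
        for number, (key, count) in enumerate(window, start=1)
    ]


for ref, count, label in expand("LA=ENGLISH"):
    print(f"{ref:<4}{count:>8} {label}")
```

The E-numbers are just positions in the displayed window, which is why `S E3` can later stand in for the full term.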
37
Select Command From an Expanded List
  • ? S E3
  •       S9  752078  LA='ENGLISH'
  • ?DS
  • Set     Items   Description
  • S1      54890   MATHEMATICS
  • S2       4055   FEAR
  • S3        116   S1 AND S2
  • S4     169200   1 AND 2
  • S5      58984   MATH OR MATHEMATICS
  • S6        138   S2 AND S5
  • S7        138   S3 OR S6
  • S8        138   S7/ENG
  • S9     752078   LA='ENGLISH'
  • ?
38
Combining Our Results with the Set of All English Language Records; What Have We Spent So Far?!!!
  • ? S S7 AND S9
  •              138  S7
  •           752078  S9
  •      S10     129  S7 AND S9
  • ?COST
  •        30may03 22:42:18 User556323 Session D1.2
  •             $1.22    0.817 DialUnits File1
  •      $1.22  Estimated cost File1
  •      $0.20  INTERNET
  •      $1.42  Estimated cost this search
  •      $1.46  Estimated total session cost   1.057 DialUnits
  • ?
39
The TYPE Command
  • ? T S10/8/1-5
  •  10/8/1
  • DIALOG(R)File   1:(c) format only 2003 The Dialog Corporation. All rts. reserv.
  • 01106676 ERIC NO.: ED458214 CLEARINGHOUSE NO.: TM032844
  • The Debate over National Testing. ERIC Digest.
  •   April 2001 (20010400)
  • DESCRIPTORS: Academic Achievement; *Achievement Tests; Elementary Secondary Education; *Federal Government; *Government Role; *National Competency Tests; Performance Based Assessment; *Politics; Test Construction; *Test Use
  • IDENTIFIERS: ERIC Digests; *National Assessment of Educational Progress
40
"10/8/2"
  •  10/8/2
  • DIALOG(R)File   1:(c) format only 2003 The Dialog Corporation. All rts. reserv.
  • 01098402 ERIC NO.: ED452367 CLEARINGHOUSE NO.: CE081628
  • Women and Minorities in High-Tech Careers. ERIC Digest No. 226.
  •   2001 (20010000)
  • DESCRIPTORS: Attitude Change; *Career Education; Change Strategies;
  •   Community Colleges; Computer Attitudes; *Education Work Relationship; Educational Change; Educational Environment; Educational Policy; Educational Technology; Elementary Secondary Education; Employed Women; Employment Patterns; Equal Education; Information Needs; Job Training; Leadership; Literature Reviews; *Minority Groups; Needs Assessment; *Nontraditional Occupations; Recruitment; Role Models; Sex Fairness; *Technical Occupations; Technological Advancement; Trend Analysis; Two Year Colleges; Vocational Education; *Womens Education
  • IDENTIFIERS: ERIC Digests
  •  10/8/3
  • .
  • .
  • .
41
Details of the Type Command

  • ?T S10/8/1-5


  • “S10” is the set number that you are choosing


  • “8” is the format that you’re choosing for your output


  • “1-5” are the first five records of your resultant set S10


  • http://library.dialog.com/bluesheets/html/bl0001.html
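The three parts of the TYPE argument can be pulled apart with a regular expression. This sketch handles only the two shapes that appear on these slides (`S10/8/1-5` and `S12/9/1`); `parse_type_argument` is a hypothetical helper, and other forms Dialog may accept are not covered.

```python
import re


def parse_type_argument(argument):
    """Split 'S10/8/1-5' into set number, format, and record numbers."""
    match = re.fullmatch(r"S(\d+)/(\w+)/(\d+)(?:-(\d+))?",
                         argument.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized TYPE argument: {argument!r}")
    set_number, output_format, first, last = match.groups()
    first = int(first)
    last = int(last) if last else first  # a single record has no range
    return {
        "set": int(set_number),
        "format": output_format,  # kept as text rather than converted
        "records": list(range(first, last + 1)),
    }


print(parse_type_argument("S10/8/1-5"))
print(parse_type_argument("S12/9/1"))
```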


42
Expand of the Basic Index in ERIC
  • ? E GRADE 5


  • Ref   Items   RT  Index-term
  • E1     3486    3   GRADE 3
  • E2     4327    3   GRADE 4
  • E3     4463    3 *GRADE 5
  • .
  • .
  • .
  • E7     2561    5   GRADE 9
  • E8        1        GRADEAID
  • E9       44        GRADEBOOK
  • E10       1        GRADEBOOK PROGRAMS
  • E11      15        GRADEBOOKS
  • E12       1        GRADECALC
  •           Enter P or PAGE for more
  • ?
43
"? S E3"
  • ? S E3
  •      S11    4463  'GRADE 5'
  • ?S S10 AND S11
  •              129  S10
  •             4463  S11
  •      S12       1  S10 AND S11
  • ?
44
"? T S12/9/1"
  • ? T S12/9/1
  •  12/9/1
  • DIALOG(R)File   1:ERIC
  • (c) format only 2002 The Dialog Corporation. All rts. reserv.
  • 00490065 ERIC NO.: ED219235 CLEARINGHOUSE NO.: SE038286
  • Teaching Problem Solving; the Effect of Algorithmic and Heuristic Problem
  • Solving Training in Relation to Task Complexity and Relevant Aptitudes.
  •   de Leeuw, L.;
  • CORP. SOURCE: Free Univ., Amsterdam (Netherlands). (BBB20582)
  •   16pp.
  •   1982 (19820000)
  • EDRS Price MF01/PC01 Plus Postage.
  • LANGUAGE: English
  • DOCUMENT TYPE: 143 (Reports--Research)
  • RECORD TYPE: ABSTRACT
  • COUNTRY OF PUBLICATION: Netherlands
  • JOURNAL ANNOUNCEMENT: RIEDEC1982
  •    Sixty-four fifth and sixth-grade pupils were taught number series
  • extrapolation by either an algorithm, fully prescribed …
45
LOGOFF Command and Session Costs
  • ?LOGOFF
  •        30may03 22:47:24 User556323 Session D1.2
  •             $1.83    1.222 DialUnits File1
  •                $0.00  5 Type(s) in Format  8
  •                $0.00  2 Type(s) in Format  9
  •             $0.00  7 Types
  •      $1.83  Estimated cost File1
  •      $0.45  INTERNET
  •      $2.28  Estimated cost this search
  •      $2.32  Estimated total session cost   1.462 DialUnits


  • Return to logon page!


46
Formulating Your Search Strategy
  • We need to plan ahead: with Dialog, searching costs money! You’re paying “by the drink.”
    • What topic do we have in mind?
    • Where will we look for information?
    • What terms should we use?
    • How will we combine our terms?
49
Example BIOSIS Search
  • ? B 5
  •        02jun03 01:29:18 User556323 Session D2.1
  •             $0.00    0.241 DialUnits FileHomeBase
  •      $0.00  Estimated cost FileHomeBase
  •      $0.02  INTERNET
  •      $0.02  Estimated cost this search
  •      $0.02  Estimated total session cost   0.241 DialUnits


  • File   5:Biosis Previews(R)  1969-2003/May W4
  •        (c) 2002 BIOSIS


  •       Set  Items  Description
  •       ---  -----  -----------
  • ?
50
"?S"
  • ?S PREEN OR PREENED OR PREENING
  •              106  PREEN
  •               36  PREENED
  •              485  PREENING
  •       S1     602  PREEN OR PREENED OR PREENING
  • ?S MALLARD OR MALLARDS
  •             1680  MALLARD
  •              884  MALLARDS
  •       S2    2096  MALLARD OR MALLARDS



  • … There has got to be an easier way …


  • There is, and we’ll learn it soon!
51
"? S ANAS (W)..."
  • ? S ANAS (W) PLATYRHYNCHOS
  •             4510  ANAS
  •             2866  PLATYRHYNCHOS
  •       S3    2845  ANAS (W) PLATYRHYNCHOS
  • ?S S2 OR S3
  •             2096  S2
  •             2845  S3
  •       S4    3515  S2 OR S3
  • ?S S1 AND S4
  •              602  S1
  •             3515  S4
  •       S5      21  S1 AND S4
  • ?S S5/ENG
  •       S6      13  S5/ENG
  • ?
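In the transcript above, 2,866 records contain PLATYRHYNCHOS but only 2,845 contain it immediately after ANAS, because (W) requires the two words to be adjacent and in that order. The word positions recorded in the index make this check possible. A sketch with made-up positions; the `adjacent` helper is hypothetical and ignores field boundaries for simplicity.

```python
# term -> {record number: [word positions]}; illustrative data only.
positions = {
    "anas": {1: [4], 2: [2, 9], 3: [7]},
    "platyrhynchos": {1: [5], 2: [12], 4: [1]},
}


def adjacent(first, second):
    """Records where `second` occurs in the position right after `first`."""
    hits = set()
    for record, first_positions in positions.get(first, {}).items():
        second_positions = set(positions.get(second, {}).get(record, []))
        if any(p + 1 in second_positions for p in first_positions):
            hits.add(record)
    return hits


anded = set(positions["anas"]) & set(positions["platyrhynchos"])
print(sorted(anded))                                # [1, 2]
print(sorted(adjacent("anas", "platyrhynchos")))    # [1]
```

Record 2 contains both words but not side by side, so AND retrieves it while (W) does not.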
52
SAVE TEMP and EXS
  • ?T S6/8/1-5
  • Those results looked good.  Save the search, then begin a different file and execute the same strategy there:
  • ?SAVE TEMP
  • Temp SearchSave "TD001" stored
  • ?b 185;exs
53
EXS in Action
  • File 185:Zoological Record Online(R)  1978-2001/Dec
  •        (c) 2001 BIOSIS
  • *File 185: File will be reloaded. Accession numbers will change.
  •       Set  Items  Description
  •       ---  -----  -----------
  • Executing TD001
  •               23  PREEN
  •                0  PREENED
  •              219  PREENING
54
"S1"
  •       S1     239  PREEN OR PREENED OR PREENING
  •              383  MALLARD
  •              322  MALLARDS
  •       S2     701  MALLARD OR MALLARDS
  •             3661  ANAS
  •             1668  PLATYRHYNCHOS
  •       S3    1656  ANAS (W) PLATYRHYNCHOS
  •              701  S2
  •             1656  S3
  •       S4    1687  S2 OR S3
  •              239  S1
  •             1687  S4
  •       S5       3  S1 AND S4
  • >>>Term "ENG" is not defined in file 185 and is ignored
  •       S6       3  S5/ENG
  • ?


55
Evaluating Search Results
  • Evaluate as you are searching
    • Search results that are unexpected can indicate a simple error in your data entry
    • Using AND → your result should be a smaller set
    • Using OR → your result should be a larger set
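These two rules of thumb can be written as bounds on the result size: an AND result can be no larger than the smaller input, and an OR result lies between the larger input and the sum of both. A small sketch (the `plausible` helper is hypothetical) applied to counts from the sessions in these slides:

```python
def plausible(operator, left, right, result):
    """Check a result count against the bounds implied by the operator."""
    if operator == "AND":
        return 0 <= result <= min(left, right)
    if operator == "OR":
        return max(left, right) <= result <= left + right
    raise ValueError(f"unsupported operator: {operator}")


print(plausible("AND", 54890, 4055, 116))     # True
print(plausible("OR", 10825, 54890, 58984))   # True
print(plausible("AND", 4055, 58984, 169200))  # False: bigger than both inputs
```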


56
This is a Common Mistake
  • Suppose you had two sets.  Set S1 is the result of searching for ‘fear’ (4,055 records) and set S2 is ‘math or mathematics’ (58,984 records):


  • ?S 1 AND 2
  •           203089  1
  •           198285  2
  •       S4  169200  1 AND 2
  • ?
  • What has happened here???
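The counts in the transcript (203,089 and 198,285) do not match the set sizes, which suggests that SELECT treated 1 and 2 as search terms rather than as set numbers. A toy model of that behavior, assuming that only a token with an S prefix refers to a set; the `resolve` helper is hypothetical and real command parsing is more involved.

```python
# Existing sets and their sizes, as described on the slide.
sets = {"S1": 4055, "S2": 58984}

# Postings for the literal terms "1" and "2", taken from the transcript.
term_counts = {"1": 203089, "2": 198285}


def resolve(token):
    """Return ('set' or 'term', item count) for one operand of SELECT."""
    key = token.upper()
    if key in sets:
        return ("set", sets[key])
    # Anything else is looked up as a word in the index.
    return ("term", term_counts.get(token, 0))


print(resolve("1"))   # ('term', 203089)
print(resolve("S1"))  # ('set', 4055)
```

Typing `S S1 AND S2` instead would combine the two sets as intended.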
57
Other Common Mistakes
  • Misspellings
  • Using the wrong set number from a large search history
  • Not catching when a limit or prefix has not worked
  • Typing out the wrong set … Matt