Datenmodell Buch (BD): Unterschied zwischen den Versionen
Anna (Diskussion | Beiträge) K (add) |
Anna (Diskussion | Beiträge) K (corr.) |
||
Zeile 183: | Zeile 183: | ||
# load some publishers | # load some publishers | ||
− | + | LOAD DATA INFILE '/home/hh/wrk/databases/quin/publishers.tbl' INTO TABLE Publisher_tbl | |
− | + | FIELDS TERMINATED BY '|' | |
− | + | (name, Publisher_id); | |
# | # |
Version vom 12. Juni 2009, 07:59 Uhr
Die Beispiele stammen aus "The open source XML database toolkit" von Liam R. E. Quin
Die Aufgabe: Modellierung eines Buches für den Gebrauch in Datenbanken. Welche Charakteristika sind hervorzuheben und wie sind sie in einen Zusammenhang zu bringen?
Gewöhnlich hat ein Buch im einfachsten Fall diese Kennzeichen:
- eine Autorin
- einen Verlag
- ein Erscheinungsjahr
In traditionellen Zettelkästen ist für jedes Buch ein Eintrag angelegt worden, welcher diese Angaben auflistet. Das führt zu massiver Redundanz, weil z.B. der Name der Autorin auf jeder Karte wiederholt wird. Damit verbunden steigt die Fehleranfälligkeit durch falsche Angaben. (Im traditionellen bibliographischen Gebrauch werden die Fehler kompensiert.)
Nicht nur werden dieselben Autorinnen mehrfach erfasst. Es kommt auch vor, dass ein Buch von mehreren Autorinnen geschrieben ist. Und mehrere Bücher haben denselben Verlag. Wenn man pro Buch einen Zettel ausfüllt, liegt diese logische Struktur sozusagen brach. Bestimmte Arbeitsschritte werden dadurch kompliziert. Wenn sich zum Beispiel der Name der Autorin von "Schmidt" zu "Schmidt-Kowarzik" ändert, muss er auf allen Zetteln ausgebessert werden. Darum empfielhlt sich ein etwas differenzierteres Datenmodell. (Die in relationalen Datenbanksystemen geltenden Regeln sind in sogenannten "Normalformen" festgelegt.)
An der Stelle "Author" in der Buch-Tabelle steht nicht ein Autorenname, sondern diese Tabelle steht in Beziehung zu einer anderen Tabelle, welche die Autorinnen erfasst.
Das wird in SQL folgendermaßen erzeugt:
CREATE TABLE Book_tbl ( # id: unique id for identifying each book # (ISBNs vary by country, and this way we can # generate the ID automatically) Book_id INT NOT NULL PRIMARY KEY, # title, truncated if necessary Title VARCHAR(150) NOT NULL, # identify the publisher: Publisher_id INT REFERENCES Publisher_tbl(Publisher_id), # When the book was first published # we'd like to make it a DATE, but we usually only have # a year, not a day/month/year, so we just use a number: Date INT, Pages INT );
CREATE TABLE Author_tbl ( # identify the author; the Author_id refers to the Author table; # note that two authors with the same name might be entirely # different people. Author_id INT NOT NULL PRIMARY KEY, LastName VARCHAR(35) NOT NULL, FirstNames VARCHAR(40), Sortkey VARCHAR(25) # other things we might know about a writer: # gender CHAR(1), male/female/unknown/other # nationality # birthplace # birthdate # birthday (month, day) );
CREATE TABLE Publisher_tbl ( Publisher_id INT NOT NULL PRIMARY KEY, # name, e.g. Wiley, truncated if necessary because SQL, unlike XML, # is not really too happy about fields of unrestricted length: Name VARCHAR(50) # In the future, we could maybe add more information here... );
Der ganze Zusammenhang der Datencharakteristik wird gerne in Diagrammen dargestellt, welche die Beziehungen der konstruierten Entitäten erfassen.
Listeneinträge in die derart angelegten Tabellen sehen dann so aus:
Für books:
420|Consider Phlebas|80|1988 430|The Player Of Games|\N|\N 440|Use of Weapons|\N|\N 450|Canal Dreams|\N|\N 460|Espedair Street|80|1988 470|The Bridge|150|1987 480|Complicity|10|1993
Für authors:
Banks|Iain M.|Banks, Iain M.|130 Bannon|Mark|Bannon, Mark|140 Barker|M.A.R.|Barker, M.A.R.|150 Bates|Brian|Bates, Brian|160 Baudino|Gael|Baudino, Gael|170 Bayley|Barrington|Bayley, Barrington|180 Beagle|Peter|Beagle, Peter|190
Für publishers:
- Hier steckt ein Fehler drin. Wer merkt ihn? So wir diese Liste vom Datenbankprogramm nicht verarbeitet.
Doubleday||USA|290 Futura||USA|80 Grafton|London|UK|90 Guild/Bantam||USA|100 Legend||USA|110 Methuen||USA|120 Millennium|London|UK|290
Und das Skript zur Erzeugung der kompletten Dankenbank, inklusive Beispielsdaten, sieht so aus:
# create a "book" database # # Copyright, Liam Quin, 1999 # # see ../license for copying restrictions. # # If you are called emily, using mySQL on a computer named Igor, # you can do # mysql -h igor -u emily -p < bkcreate # For other databases, the command line options vary. # # If you are using Oracle, you may need to change # VARCHAR to VARCHAR2 everywhere. CREATE DATABASE books; USE books; # # First, an author CREATE TABLE Author_tbl ( # identify the author; the Author_id refers to the Author table; # note that two authors with the same name might be entirely # different people. Author_id INT NOT NULL PRIMARY KEY, LastName VARCHAR(35) NOT NULL, FirstNames VARCHAR(40), Sortkey VARCHAR(25) # other things we might know about a writer: # gender CHAR(1), male/female/unknown/other # nationality # birthplace # birthdate # birthday (month, day) ); # load some authors LOAD DATA INFILE '/home/hh/wrk/databases/quin/authors.tbl' INTO TABLE Author_tbl FIELDS TERMINATED BY '|' (LastName, FirstNames, Sortkey, Author_id); # # Next, a publisher CREATE TABLE Publisher_tbl ( Publisher_id INT NOT NULL PRIMARY KEY, # name, e.g. Wiley, truncated if necessary because SQL, unlike XML, # is not really too happy about fields of unrestricted length: Name VARCHAR(50) # In the future, we could maybe add more information here... ); # load some publishers LOAD DATA INFILE '/home/hh/wrk/databases/quin/publishers.tbl' INTO TABLE Publisher_tbl FIELDS TERMINATED BY '|' (name, Publisher_id); # # Now a book and its attributes: CREATE TABLE Book_tbl ( # id: unique id for identifying each book # (ISBNs vary by country, and this way we can # generate the ID automatically) Book_id INT NOT NULL PRIMARY KEY, # title, truncated if necessary because SQL, unlike XML, # is not really too happy about fields of unrestricted length: Title VARCHAR(150) NOT NULL, # identify the publisher: Publisher_id INT REFERENCES Publisher_tbl(Publisher_id), # When the book was first published # we'd like to make it a DATE, but we usually only have # a year, not a day/month/year, so we just use a number: Date INT, # Number of pages in some particular unspecified edition. # We might use that to # give thicker books bigger icons, perhaps. # Strictly speaking this might belong in an EDITIONS table, # but that's more detail than we need! Pages INT ); # load some books LOAD DATA INFILE '/home/hh/wrk/databases/quin/books.tbl' INTO TABLE Book_tbl FIELDS TERMINATED BY '|' (Book_id, Title, Publisher_id, Date); # # who wrote which book? CREATE TABLE Wrote_tbl ( Author_id INT NOT NULL REFERENCES Author_tbl, Book_id INT NOT NULL REFERENCES Book_tbl ); # load the relationship LOAD DATA INFILE '/home/hh/wrk/databases/quin/wrote.tbl' INTO TABLE Wrote_tbl FIELDS TERMINATED BY '|' (Author_id, Book_id); # # A character or place seen in a book CREATE TABLE Name_tbl ( Name_id INT NOT NULL PRIMARY KEY, Name VARCHAR(50) NOT NULL, # city|region|country|person|creature|plant|food|other Type CHAR(12) NOT NULL, Description VARCHAR(250) ); # load some names LOAD DATA INFILE '/home/hh/wrk/databases/quin/names.tbl' INTO TABLE Name_tbl FIELDS TERMINATED BY '|' (Name_id, Name, Type, Description); # # Where the name was referenced CREATE TABLE References_tbl ( Name_id INT NOT NULL REFERENCES Name_tbl, Book_id INT NOT NULL REFERENCES Book_tbl, # Record the role a particular name plays in a given book. # for example, Bilbo is "protagonist" in the Hobbit, # but in The Lord of the Rings he is secondary. Role VARCHAR(100) NOT NULL ); # load some references LOAD DATA INFILE '/home/hh/wrk/databases/quin/references.tbl' INTO TABLE References_tbl FIELDS TERMINATED BY '|' (Name_id, Book_id, Role); # done -- now we can use the database!