FileMaker has long been the best-selling Mac flat-file database system, because it surprisingly powerful, yet easy enough to use that you don't have to have a degree in computer science to work with it. The user interface in FileMaker Pro 2.1 was so good that it was hard to imagine how Claris could improve it. (Can you say that about your word processor?) But they did.
FileMaker Pro 3 is a major improvement over the version it supersedes (2.1). But this is saying too little, because version 3 is not just better, it is, in some fundamental way, a different kind of application. A relational database application differs from a flat-file system at least as much as a page-layout program differs from a word processor. Remarkably, not only does FileMaker Pro retain all of the features of FileMaker Pro 2.1, it actually looks and feels enough like its predecessor that experienced users of the earlier version will find the transition to version 3 effortless. They will not have to pay any attention to the relational capabilities of FileMaker Pro at first, if they don't want to. In fact, relationality is so gracefully added to FileMaker Pro that many users will be able to ignore it altogether, not just at first, but forever. To put it differently, this is the first RDBMS (relational database management system) for the Mac that is so user-friendly, it makes sense even for very simple flat-file databases. Nevertheless, FileMaker Pro 3 makes available to users a brave new world of possibilities.
What is almost as useful is that you can now trap your own errors as a script runs. Say the script performs a predefined Find, but finds no records. With the new error-checking step, you can intercept FileMaker Pro's built-in alert ("No records match this request," with the options 'Cancel' or 'Modify Find') and replace it either with your own alert or simply branch to another part of the script, using conditional branching.
With the new "set field" scripting command, you can change the data in a field without having to use other fields. In FileMaker Pro 2.1, if you wanted to increase the value of field 'foobar' by 1, you have to have a calculation field 'nextFoobar' hidden on the layout with the formula = "foobar + 1"; the script would then copy the value in 'nextFoobar' and paste it into 'foobar'. Now you can simply use the "set field" command, which allows you to define a calculation in the script, e.g. "set field 'foobar' to 'foobar' + 1".
You can now add comments to your scripts, which is especially helpful when you are writing complex scripts. Unfortunately it is still not any easier than it used to be to view some of the parameters for script commands such as Find definitions or the names of subscripts to be called from another database. For these details, you still have to print the script out. And you still cannot cut or copy selected steps in a script (for example, so you can paste them into another script).
Dear «firstName», how are you and «significantOther» these days in «city»? I know it's been «yearsSinceLastSeen» years since I last ran into you, but I'm running for Congress now and since I heard that you and «significantOther» are pulling down about «estimatedAnnualIncome» these days, I thought I'd write to say Howdy!
Assuming the data were in the proper fields, this might actually print out as:
Dear Lem, how are you and Velma these days in New Braunfels? I know it's been 27 years since I last ran into you, but ....
Note that the lines of text close up and reflow properly if the data in the field takes up less space than you have allotted to the field. You can have a conditional merge, but you will have to use calculation fields and/or scripting steps to evaluate the conditions for you beforehand. For example, not everybody has a significantOther. So you might need to create a field for the merge example above, call the field "ifSigOther", and give it a formula like this:
IF (isEmpty(significantOther), "", "and " & significantOther)
This means, if the field significantOther is empty, then leave this space blank; otherwise put "and" in front of the name in significantOther (creating a result like "and Velma"). So say that Velma was killed in a mule stampede and you removed her from the significantOther field, and assuming that you substituted the field "ifSigOther" in the merge example above, you'd end up with this:
Dear Lem, how are you these days in New Braunfels? I know it's been 27 years since I last ran into you, but I'm running for Congress now and since I heard that you are pulling down about $197,000 these days, I thought I'd write to say Howdy!
For easy, straightforward merges, this will do the trick, and it's a darned sight easier than exporting your data from FileMaker Pro, importing the text document into your word processor and massaging the data heavily there.
A relation is a link between two databases that permits one of them (the master database) to borrow and use data from the other. The FileMaker Pro documentation calls the latter file the 'related' database; I prefer to call it the 'servant' or 'lending' database. Since I will discuss in some detail both the concepts and their practical applications in the appendices that follow, I will be fairly brief here, and focus (a) on the ways in which relationality changes the appearance of FileMaker Pro and (b) on the weaknesses of Claris's implementation of relationality in this version.
To set up a relation between two databases, you do something that is very similar to what you do to define a lookup. When you defined a lookup in FileMaker Pro 2.1, you told the program something like this: "When I type a social security number into the field socSecNum in this database and it matches a number in the field socSecNum in the 'Employees' database, copy the name that goes with that number from the field 'empName' there into the field 'empName'here." The relation and the lookup have this much in common: a link is established between two files that allows data to flow from one to another, and the flow of that data is triggered by a match between data in defined "match fields."
But there are important differences between lookups and relations. First, a lookup is a one-time event, while a relation is, well, an ongoing relationship. Once the lookup has pasted the source data into the target field, it is done forever (unless you trigger the lookup again, e.g by retyping the social security number if you typed it incorrectly the first time). You can go to the source file and change data there without updating information that has formerly been copied from that file via lookups. In other words, a lookup is really just a special way of copying data from one place and pasting it in another. A relation, on the other hand, resembles the Mac's publish-and-subscribe protocol. When you update the source file, the master file (which is borrowing the source file's data) is automatically updated. If you got Susan Smith's name from the Employees database via a lookup on the match fields for social security numbers, and subsequently Susan changed her last name to Brown, she would remain "Smith" in t he master database unless you did a relookup. But if you had established a relation to get her name, then changing her name in the source file would automatically change it in the master file.
The reason for this is really the second major difference between lookups and relations. When you use a relation, the source data does not come into the master database at all. It is merely displayed there. A lookup actually copies the source data, then pastes it into the master file, so the data now exist in two places. So in some sense it is not correct to say that the data in the master file are 'updated' when you change them in the source file. Because the data are never 'in' the master file at all.
Now when you defined a lookup in FileMaker Pro 2.1, you simultaneously told FileMaker Pro what you wanted to do with it (e.g. copy an employee's name from the source file to master file). But in FileMaker Pro 3, defining the relation is one thing -- it involves identifying the servant/source file, and naming the match fields -- but using it involves another step. To use the relation, you have to place fields from the servant database on a layout in the master file. You do this by using the Field tool, the same way you placed fields on a layout in version 2.1 -- except that in version 3, you can place not only the fields defined for the current database, but also fields in any and all related (lending) databases. If you wish to display data borrowed from a single record in the servant database -- for example, the name that corresponds to a unique social security number -- you simply place the field on the layout. But if you wish to display data borrowed from several records -- for example, the names of all books whose author is "William Shakespeare" -- then you must first place a special element called a 'portal' in the layout, and then you place the related fields inside the portal. A portal looks like a repeating field when you place it, because it has a number of rows (you define how many), one row for each related record's data. But you can also include a scroll bar -- something you cannot do with a repeating field. The scroll bar makes it possible to view all the borrowed data in the portal, even if the portal does not have enough rows to show it all at a glance. In this sense, a portal is infinitely expandable.
And that is about it for the basics. Some of the complexities that are involved in setting up relations in programs like Helix Express and Fox Pro are missing from FileMaker Pro. You just establish the relation, then indicate in the master database where you want the borrowed data to be displayed.
Now there are a couple of problems with the way FileMaker Pro 3 handles portals and the data they contain.
First, the data are displayed unsorted (that is, in creation order). Apparently FileMaker Pro 3 runs to the lending database's data tables and grabs all the matching records one at a time, starting from the beginning of the table. This is an annoyance, but not a fatal flaw, since there are a couple of ways to work around this limitation. One involves creating a script that switches to the lending database, sorts the data, EXPORTS it to another ancillary database or a tabbed-text file, then REIMPORTS it (making sure you overwrite the data still there, so you don't end up with duplicates). The order in which records are imported becomes their creation order, and if you import them so that they are sorted to begin with, then (to put it paradoxically), they will be sorted even when they are not sorted. Since the (technically) unsorted order has changed in the lending database, data will display (logically) sorted in the master (borrowing) database's portal.
The other workaround is too complicated to describe here, although I will be happy to explain it to anyone who is interested. It involves using a self-relation and lookups in the source file to get the unsorted data in field A into field B in sorted order. I got this idea from Dennis Hayes on CompuServe.
The second problem is more serious: portals don't slide. When you place fields in a portal, you must make them big enough to accomodate all the data you think they will have to display. If some of the records displayed in the portal require three lines of text, while others require only one, when you print the master record out, the gaps created by the blank lines in the shorter related records will not be filled in, as they would if the data were owned by the master database and displayed in a repeating field.
I go over these ideas in detail and provide examples in the appendices below.
What is a "relation"? In its simplest and least technical sense, it is a way of having one database display and make use of data from another database without actually storing that data. This is what a relation is in FileMaker Pro.
Let's reconsider the card catalogue at the library. If you were given the task of putting the library's card catalogue into a computer database, you could duplicate the card catalogue's original structure by creating three entirely separate databases, for authors, titles, and subjects. But it would be much smarter to create a relational database or suite of databases to handle this job. By doing so you would create a system that automatically kept all its part up to date, and which took much less space to store on disk.
Consider the following sketch of what the suite might look like. There are three databases, named AUTHOR, TITLE and GENRE. (I'm using genre here because I find it easier to use as an example than "subject," but the principle is the same.)
First, we would create the following fields for each database:
Note the equals signs between AUTHOR:name and TITLE:author. These have been defined as "match fields." When you enter data in AUTHOR:name that matches (is identical to) the data in TITLE:author, a link is established between the records in the two databases that share this data. And when this link is established, one database can "borrow" the rest of the data from the other. You could, for example, while looking at a record in AUTHOR, see a list of all the books that the library has by this author. This is what the arrow pointing from TITLE to a blank in the AUTHOR columns represents: the movement of borrowed data. Note that AUTHOR has no field 'bookTitle' defined, because it is unnecessary. AUTHOR does not need a field to store this data, because it doesn't actually possess it.
(We could have given the match fields in AUTHOR and TITLE the same name in both databases, e.g. 'authorName', and usually doing so will make it easier for you to keep track of your relations. I deliberately gave them different names to make the point that what is being matched is the data contained in the fields, not the fields themselves or their names.)
Each of the databases represented by the first two columns in the diagram above is both a borrower and a lender, which is why some of the arrows point one way, and others point in the opposite direction. Now AUTHOR and TITLE are not "trading" data. They do not really have a two-way relationship. Instead, AUTHOR is lending data to TITLE, and in an entirely distinct transaction, TITLE is lending data to AUTHOR. It is not like what happens when the editor of the New York Times calls the editor of the Wall Street Journal on the phone and they have a conversation. Instead, it is like when the editor of the Times subscribes to the Journal, and the editor of the Journal subscribes to the Times. But it is worth noting that two databases can share data in this way. The relationship between TITLE and GENRE, on the other hand, is one-sided: GENRE borrows data from TITLE, but lends nothing to TITLE in return. (Like when the Editor of the National Enquirer subscribes to the New York Times.)
Let's put a little data in these databases, just to make the example more concrete. In AUTHOR, let's create just one record for the moment:
AUTHOR [record #1]
name = "William Shakespeare"
authorDates = "1564-1616"
language = "English"
Now let's create a record in the TITLE database:
TITLE [record #1]
author = "William Shakespeare"
bookTitle = "The Tempest"
genre = "Romance"
Let's review what we are doing. We want records in the TITLE database to display the author's birth and death dates, but we want that data in the AUTHOR database as well. We could keyboard these dates in for each new record in both databases. But this is unnecessarily duplicative, not only because the data gets stored in two different databases, but because if we have fifty or more different records for Shakespeare in TITLE, we are going to have to reenter that data for each one of these records. In other words, this approach (the one used by the physical card file in the old library) wastes both space (for storage) and time (for data entry).
Now if we were working in FileMaker Pro 2.1, we could create a lookup. Once Shakespeare's dates have been entered in the source file (AUTHOR), then when "William Shakespeare" is keyboarded into the 'author' field in TITLE, the lookup is triggered, and data from Shakespeare's record in AUTHOR is copied into field designed to hold that data in TITLE. In this situation, a lookup -- from the end user's point of view -- will achieve a result that appears identical to the result achieved by a relation.
But appearances can be deceiving. The lookup has three problems.
First, while it eliminates the waste of data-entry time that keyboarding those dates into each new TITLE record would represent, it still wastes storage space. Shakespeare's dates get stored as separate magnetic impulses for each and every record. Moreover, you have to create a field in TITLE to hold this data.
Second, a lookup is only good for "many-to-one" relations; for "one-to-many" relations, on the other hand, it is useless. In other words, it will look up Shakespeare's dates when you create a new TITLE record just fine. But you cannot use a lookup to grab the data from all the TITLE records that represent the Bard's plays and copy them into, say, a long repeating field in AUTHOR. The best that a lookup originating in AUTHOR can do is get the title data from the first record it encounters that has "William Shakespeare" in the 'author' field. Given the data we entered above, that would be The Tempest. The records for all of Shakespeare's other plays will be ignored.
The third problem with a lookup is that it is like a photocopying machine. It saves you some trouble, sure, but you can only be absolutely sure that it is accurate at the moment you remove the warm copy from the photocopying machine. Five minutes later, the photocopy's accuracy begins to be questionable, because the original may already have been modified. In our database example, say you had entered fifty records for works by Shakespeare and each dutifully looked up and copied his life dates from AUTHOR -- and then you discover that you had initially entered his life dates incorrectly in AUTHOR! Even if you correct his dates in AUTHOR, you still have fifty records in TITLE that are wrong. Maybe you'll remember to switch to TITLE, find all those records and do a relookup to "freshen" the dates. But you are busy when you notice this error, so you put it off to later, and then....
In this situation, a relation is probably going to be superior to a lookup, although it may not be as much better as you might think at first. A relationalways eliminates the first problem with lookups -- the waste of storage space. But if you have a 2Gb hard drive, saving storage space may not be one of your primary concerns. The second limitation of the lookup -- that it is useless for one-to-many relationships -- is, of course, not always a problem, and would not be a problem here if we had no interest in viewing a list of the author's works in the author's database. (In FileMaker Pro 2.1, you would have created a script that copied the author's name in AUTHOR, then switched to TITLE, went into Find mode, and pasted the name into the field 'author'.)
The essential advantage of the relation, then, and its essential advantage over the lookup, is that the relation is vital and ongoing, where the lookup is not so much a relationship as a one-time event. You want and need relations, rather than lookups, when you know that you will always want the most up-to-date data possible. (We will return to this point in a minute.) But always keep in mind that you do not always want the freshest data. We are not talking about dairy products here. We are talking about data, and sometimes there is a historical significance to the data that you don't want to alter. Say you invoiced a customer in November when the widgets you sold him cost $1.95 apiece, then two weeks later, you raise the price of widgets. You want the new price to be reflected in December's invoice, but you want the old price to stay put in your records for November. If they don't, you are going to lose a customer when you call him up and tell him the check he sent in was not large enough. Moral: Lookups and relations are really quite different tools and each has its own uses.
Presume now that we have spent a few minutes creating more records for works by Shakespeare in the TITLE database. Now we can turn to the AUTHOR database and create a relation whereby it borrows data from the TITLE database. This relation coincidentally is based upon the same match fields that the AUTHOR --> TITLE relation was based on: the fields "authorName" found in both databases. When there is a match between the two "authorName" fields, we would like to see displayed in the AUTHOR database a list of all the records in TITLE that represent works by each author. This is a one-to-many relationship; the one record in AUTHOR for William Shakespeare is going to display the data in the bookTitle field for every record in TITLE whose author is William Shakespeare. While we're at it, it would also be nice to see the genre information for each work. So in AUTHOR, when viewing the record for William Shakespeare, we want to see something like the following:
AUTHOR [record #1]
authorName = "William Shakespeare"
authorDates = "1564-1616"
language = "English"
[bookTitle = The Tempest, genre = Romance]
[bookTitle = King Lear, genre = Tragedy]
[bookTitle = Macbeth, genre = Tragedy]
[bookTitle = Hamlet, genre = Tragedy]
[bookTitle = Richard III, genre = History]
[bookTitle = Taming of the Shrew, genre = Comedy]
I've used brackets here to indicate that these items are borrowed, not owned by the AUTHOR database itself. In FileMaker Pro, you use a "portal" to create a one-to-many relation of this sort. You use the Portal tool to draw the portal on the layout, then use the Field tool to indicate which fields from the related database you want to view in that portal.
As I said earlier, there are actually two distinct relations between AUTHOR and TITLE, one in which data are lent by T to A, and another in which data are lent by A to T. Like many other relational database programs, FileMaker Pro permits the user to edit borrowed records right in the master database, without requiring you to switch to the source database where the records actually live. Say you're looking at the record for William Shakespeare in AUTHOR and you notice that one of the records reads "Venus and Adonus". You can correct the spelling to "Venus and Adonis" in the AUTHOR database, without having to switch to TITLE and find that record.
When we set the AUTHOR database up originally, I included "language" as a field to make a point about efficiency. It might make more sense to put the field "language" in TITLE, since we are probably more interested in the language the book was written in than the language that the author spoke as his native tongue. Putting "language" in TITLE would solve the problem presented by certain multilingual authors like Vladimir Nabokov and John Milton (who wrote more in Latin than in English!). But ninety-nine percent of all authors write in a single language. If "language" is a field in TITLE, you will have to keep entering "English" for each of Shakespeare's works, for all of Faulkner's novels, and for each mystery by Agatha Christie. It makes more sense to assign the field 'language' to AUTHOR than to TITLE, not because it is the author's language that you are interested in, but because this assignment is more efficient. If you are going to design good relational systems, you have to learn to think about things like this. (And what do you do with Nabokov? You could create a special field in TITLE to be used only when the book named in a given record is published in a language other than the author's mother tongue.)
There are only a couple things to say about the GENRE database. It has a single field of its own -- "genre" -- which serves as the match field for a relation with TITLE. Searching in GENRE for "Tragedy" should produce a list of records including the Shakespearean tragedies mentioned above, along with plays by Sophocles and Racine. Once again, this is a one-to-many relation. But its relation with TITLE is one-way; it borrows data from TITLE, and lends nothing in return.
Now it is worth asking if creating such a database makes any sense. After all, you can search in TITLE on the "genre" field and do a report there. The reason the old library card catalogue had a separate Subject listing was that the card catalogue had no filtering capability. You could NOT search through the Title catalogue and have it hide all the records except those whose subjects met your search criteria. But this is precisely the sort of thing you can have a digital catalogue do. So when you transfer your database from hard-copy form to digital form, you may want to rethink it from the ground up. (It might make sense to create a separate database for GENRE, however, if you added a relation for "language"....)
When I started this project, before even launching FileMaker Pro, I spent some time with a pencil and paper, making notes about what I wanted the system to do for me. I wanted to be able to store the grades for each of my classes, of course. I wanted the system to be flexible enough to permit me to modify the kinds of assignments I tracked in each class. But I thought it would be nice if I could view the records for all classes that a particular student has taken with me (this being useful especially when I am asked to write recommendations). Then I started drawing diagrams to show how these tasks would be accomplished with databases. I needed to figure out how many databases were necessary, what would constitute a record in each database, and what sort of relations between them (or lookups) were both possible and useful.
(This sort of paper-and-pencil planning is extremely important when you are designing any database, and especially when you are contemplating something relational. Don't be suckered by FileMaker Pro's ease of use into thinking that it is smart to just launch FileMaker, define a few fields and start entering data.)
It was obvious that I was going to want to have a database for STUDENTS, and another database for CLASSES. The first notes I jotted down looked something like this (database names in all caps, field names in lowercase):
|student name||class name|
Unfortunately this got me nowhere, because there is no match field to establish the relation. So I added one:
|STUDENTS (lending)||CLASSES (borrowing)|
|student name --->|
|class name||=||class name|
In other words, when there is a match between "class name" in STUDENTS and "class name" in CLASSES, a relation is established, and I will then be able to view a list of all the students enrolled, say, in Elementary Latin in the CLASSES database (a one-to-many relation, that is, one class with many students). The problem with this is that it was too unidirectional.
I improved it a little by drawing on one of FileMaker Pro 3's new features -- the ability to make a value list for a field that is based on the index of another field, even a field in another database! To the field 'class name' in STUDENTS I assigned a value list derived from the field of the same name in CLASSES. That way, when I was working in STUDENTS, after creating a new record for a student, I could enter the 'class name' field and view a list of all the classes that were currently defined in the CLASSES database. This is technically *not* a relation, because if one of the records in CLASSES was altered or deleted, that change would not be reflected in the STUDENTS database. But it made life a little easier.
But I wanted more. I did not want to have to tell the database which classes a given student was enrolled in; on the contrary, that was what I wanted the database to tell me. It became clear that I could not do everything I wanted with just two databases. You run into logical problems when you try to define direct many-to-many relationships; they quickly become circular. One database asks the other, "Show me what classes this student is enrolled in!" and the other database replies, "You show me first what students are enrolled in each of these classes!"
The solution to this problem was to create an intermediate database, one that contained records representing the intersection between STUDENTS and CLASSES. That is, each record represented one student's enrollment in one class. The relations between the three databases could now be described as many-to-one-to-many. Like this:
|student name = student name --->|
|<--- class name||= class name|
This arrangement has a cost: all the data in the fields 'student name' and 'class name' are now duplicated in the database I have named GOBETWEEN. But the cost is worth it, because this duplication enables me to break the data from STUDENTS and CLASSES up, and makes possible matches that go both ways. Now, each student's record in STUDENTS will display a borrowed list of all the classes he is enrolled in; and each class record in CLASSES will display a list of all the students enrolled. Even better, because I defined my relations this way, FileMaker Pro 3 makes it possible for me to create records in the database GOBETWEEN while I'm working in either STUDENTS or CLASSES.
At this point it occurred to me that the system would not work properly if I had two students with the same name. It does not happen often, but it happens. The records in the STUDENTS database must be unique, and that pretty much means that they have to be built around student ID (social security) numbers, rather than names. This complicates things a little:
|student ID||= student ID --->|
|student name --->|
|<--- class name||= class name|
But now the CLASSES database displays student ID numbers, rather than names. That's not helpful! I can usually remember my own social security number, and on a good day I can remember my wife's, but that's about the limit. Two solutions occurred to me. First, I could use a lookup in GOBETWEEN to grab the student name from STUDENTS when there was a match on the "student ID" fields, and copy it into GOBETWEEN; then CLASSES could borrow the name from GOBETWEEN. But a lookup, remember, is not a real, vital relation. It is a one-time, pseudo relation. If student Jane Doe married Joe Blow in her sophomore year and her name changed, I'd lose track of her. The solution was to create a calculation field in GOBETWEEN with this formula:
student name calc = STUDENTS::student name
This only works if GOBETWEEN and STUDENTS are related, but since they are, it is possible to "capture" that borrowed data in a calculation field and give GOBETWEEN some limited ownership rights over it -- enough to permit GOBETWEEN to lend this data to CLASSES. And since this is a calculation field, if the data it draws upon change, it will change too. Now the suite's structure looks like this:
|student ID||= student ID||--->|
|student name --->|
|student name calc --->|
|<--- class name||=||class name|
Okay, that establishes the basic students-to-classes relationships. But what about assignments and grades? It looked like having a "grades" field in GOBETWEEN made the most sense; that would enable me to borrow that data in STUDENTS. But I didn't want this system to track just a single grade (which would be easy enough). I wanted it to track all the grades for each student in each class. And ideally, I wanted it to be able to accomodate a different set of graded assignments and percentage weights for each class. Here is what I came up with:
|student ID||= student ID||--->|
|student name --->|
|student name calc --->|
|<--- class name||= class name|
|<--- assignments (r)|
|<--- weights (r)|