File contents
<html>
<head>
<title>Functions of an RDBMS</title>
</head>
<body bgcolor="white">
<table width="100%">
<tr>
<td align="left"><h2>Functions of an RDBMS</h2></td>
<td align="right"><a href="slide-10.html">Prev</a> | <a href="slide-12.html">Next</a></td>
</tr>
</table>
<hr>
<blockquote>
In a world of individual users and small data sets the flat file (a
standard text file) would probably be a reasonable way to store all
your data. Who is not charmed by a file that looks like:
<pre>
Tristan Cohen, 123 Kukamunga Drive, Charleston, NC, USA
Bob Lablaw, 251 Street Road, Yangtze-ville, ShuangHuan Provice, CHINA
Elvis Presley, 12 Peanut Butter and Banana Road, Graceland, NE, JAPAN
</pre>
This file stores basic data about a group of three users and is
adequate for simple storage. What happens to this file when you have
10,000 people trying to edit it at the same time? Even worse, if
you've solved the delicate locking semantics issue .. what happens
when the file is 10 million lines long and you want to find out the
address of 'Jim Beam'? The RDBMS aims to solve this issue for you.
Keeping with the Open Source ideal the people at OpenACS focus their
efforts on working with PostgreSQL, the world's first and foremost
ACID-compliant RDBMS. They also support ORACLE, the world's most
commonly deployed commercial RDBMS.<br/><br/>
People often come up to me on the street and ask, "Tristan, what is an
ACID-compliant RDBMS?". Not wanting to write what has been already
said better before I would point them to Philip's description of ACID
compliance. I have kindly included the excerpt here. <br/><br/>
<b>Atomicity</b> <br/><br/>
Results of a transaction's execution are either all committed or all
rolled back. All changes take effect, or none do. That means, for
Joe User's money transfer, that both his savings and checking
balances are adjusted or neither are.
<br/><br/>
<b>Consistency</b> <br/><br/>
The database is transformed from one valid state to another valid
state. This defines a transaction as legal only if it obeys
user-defined integrity constraints. Illegal transactions aren't
allowed and, if an integrity constraint can't be satisfied then the
transaction is rolled back. For example, suppose that you define a
rule that, after a transfer of more than $10,000 out of the country,
a row is added to an audit table so that you can prepare a legally
required report for the IRS. Perhaps for performance reasons that
audit table is stored on a separate disk from the rest of the
database. If the audit table's disk is off-line and can't be
written, the transaction is aborted.<br/><br/>
<b>Isolation</b> <br/><br/>
The results of a transaction are invisible to other transactions
until the transaction is complete. For example, if you are running
an accounting report at the same time that Joe is transferring
money, the accounting report program will either see the balances
before Joe transferred the money or after, but never the
intermediate state where checking has been credited but savings not
yet debited.<br/><br/>
<b>Durability</b> <br/><br/>
Once committed (completed), the results of a transaction are
permanent and survive future system and media failures. If the
airline reservation system computer gives you seat 22A and crashes a
millisecond later, it won't have forgotten that you are sitting in
22A and also give it to someone else. Furthermore, if a programmer
spills coffee into a disk drive, it will be possible to install a
new disk and recover the transactions up to the coffee spill,
showing that you had seat 22A. <br/><br/>
<small>pilfered from <a
href="http://philip.greenpsun.com/panda/database-choosing.html">here</a></small><br/><br/>
Having an ACID RBDMS is no guarantee that it is good for the web.
Even though this guarantees the integrity of your data, it says
nothing about retrieving data, or the speed of it's
implementation.<br/><br/>
SQL is a declarative language defined by an occasionally refined ANSI
standard. Most programming languages are imperative, meaning that you
tell the computer how you want it to accomplish a particular task.
SQL is declarative meaning you say what you want and you leave the
database engine to figuring out how to get the information. The
standard as defined by the kind folks at ANSI is constantly ignored by
each of the RDBMS manufacturers. It is a large and difficult standard
to implement so admittedly this is a tough job for these
companies/organizations to keep up with. That being said, this does
not stop them from updating the standard every few years. No RDBMS is
entirely ANSI-92 (nee 1992), and they are even less ANSI-99 (nee 1999)
compliant. Working under the assumption that standards are built to
be largely ignored companies like ORACLE actually implement their own
versions of SQL which have important features (eg. OUTER JOINS), but
they are implemented using custom syntax. This means that while
ANSI-SQL is supposed to make most data models and queries cross
platform, in practice SQL must be rewritten for each RDBMS.<br/><br/>
Another major issue in RDBMS's is indexing. Even if a hacker is
experienced in multi-process/thread environments and can implement a
proper ACID-compliant database, that doesn't mean they are familiar
the mathematical algorithms needed to make data retrieval fast. Once
you have a million users who have each on average purchased 5 items it
becomes difficult to determine who has purchased 2 or more "Teddy
Ruxpins" in under a second. The quality of a RDMS's indexing
implementation is all important if your site will ever have even a
medium amount of data.<br/><br/>
Another major issue facing someone trying to choose an RDBMS is it's
behaviour when dealing with simultaneous writes and reads. One way to
implement transactions is to perform table wide locks. This means
that when someone is writing to a table no one can write or read from
that table. Sure this guarantees Isolation, but it also means that
when someone is doing a huge hour long transaction your database a
part of your database may be effectively locked down. For a while
even major RDBMS's like Microsoft's SQL Server had this type of
locking. Both ORACLE and PostgreSQL implement what is called
"row-level locking". While a table's row is locked (which is only
while it is being written to) no one else can write to this row. As
an individual row-write is fast, this means that the wait time is near
0. These database implementations guarantee that readers don't wait
for writers, so go on, be inefficient, have lots of 1 hour
transactions, readers won't wait a second (although your redo logs may
become swollen, but that is another matter).
</blockquote>
<hr>
<a href="tristancohen@yahoo.com">tristancohen@yahoo.com</a>
</body>
</html>