Personal tools
You are here: Home Knowledge Techie Central OpenACS presentation slide-11.html
Document Actions

slide-11.html

Click here to get the file

Size 6.7 kB - File type text/html

File contents

<html>
<head>
  <title>Functions of an RDBMS</title>
</head>
<body bgcolor="white">
<table width="100%">
  <tr>
    <td align="left"><h2>Functions of an RDBMS</h2></td>
    <td align="right"><a href="slide-10.html">Prev</a> | <a href="slide-12.html">Next</a></td>
  </tr>
</table>
<hr>

<blockquote>

In a world of individual users and small data sets the flat file (a
standard text file) would probably be a reasonable way to store all
your data.  Who is not charmed by a file that looks like:

<pre>
  
   Tristan Cohen, 123 Kukamunga Drive, Charleston, NC, USA
   Bob Lablaw, 251 Street Road, Yangtze-ville, ShuangHuan Provice, CHINA
   Elvis Presley, 12 Peanut Butter and Banana Road, Graceland, NE, JAPAN

</pre>

This file stores basic data about a group of three users and is
adequate for simple storage.  What happens to this file when you have
10,000 people trying to edit it at the same time?  Even worse, if
you've solved the delicate locking semantics issue ..  what happens
when the file is 10 million lines long and you want to find out the
address of 'Jim Beam'?  The RDBMS aims to solve this issue for you.
Keeping with the Open Source ideal the people at OpenACS focus their
efforts on working with PostgreSQL, the world's first and foremost
ACID-compliant RDBMS.  They also support ORACLE, the world's most
commonly deployed commercial RDBMS.<br/><br/>

People often come up to me on the street and ask, "Tristan, what is an
ACID-compliant RDBMS?".  Not wanting to write what has been already
said better before I would point them to Philip's description of ACID
compliance.  I have kindly included the excerpt here. <br/><br/>


  <b>Atomicity</b> <br/><br/>

  Results of a transaction's execution are either all committed or all
  rolled back. All changes take effect, or none do. That means, for
  Joe User's money transfer, that both his savings and checking
  balances are adjusted or neither are.

  <br/><br/>

  <b>Consistency</b> <br/><br/> 

  The database is transformed from one valid state to another valid
  state. This defines a transaction as legal only if it obeys
  user-defined integrity constraints. Illegal transactions aren't
  allowed and, if an integrity constraint can't be satisfied then the
  transaction is rolled back. For example, suppose that you define a
  rule that, after a transfer of more than $10,000 out of the country,
  a row is added to an audit table so that you can prepare a legally
  required report for the IRS. Perhaps for performance reasons that
  audit table is stored on a separate disk from the rest of the
  database. If the audit table's disk is off-line and can't be
  written, the transaction is aborted.<br/><br/>

  <b>Isolation</b> <br/><br/>

  The results of a transaction are invisible to other transactions
  until the transaction is complete. For example, if you are running
  an accounting report at the same time that Joe is transferring
  money, the accounting report program will either see the balances
  before Joe transferred the money or after, but never the
  intermediate state where checking has been credited but savings not
  yet debited.<br/><br/>

  <b>Durability</b> <br/><br/>

  Once committed (completed), the results of a transaction are
  permanent and survive future system and media failures. If the
  airline reservation system computer gives you seat 22A and crashes a
  millisecond later, it won't have forgotten that you are sitting in
  22A and also give it to someone else. Furthermore, if a programmer
  spills coffee into a disk drive, it will be possible to install a
  new disk and recover the transactions up to the coffee spill,
  showing that you had seat 22A. <br/><br/>

  <small>pilfered from <a
  href="http://philip.greenpsun.com/panda/database-choosing.html">here</a></small><br/><br/>

Having an ACID RBDMS is no guarantee that it is good for the web.
Even though this guarantees the integrity of your data, it says
nothing about retrieving data, or the speed of it's
implementation.<br/><br/>

SQL is a declarative language defined by an occasionally refined ANSI
standard.  Most programming languages are imperative, meaning that you
tell the computer how you want it to accomplish a particular task.
SQL is declarative meaning you say what you want and you leave the
database engine to figuring out how to get the information.  The
standard as defined by the kind folks at ANSI is constantly ignored by
each of the RDBMS manufacturers.  It is a large and difficult standard
to implement so admittedly this is a tough job for these
companies/organizations to keep up with.  That being said, this does
not stop them from updating the standard every few years.  No RDBMS is
entirely ANSI-92 (nee 1992), and they are even less ANSI-99 (nee 1999)
compliant.  Working under the assumption that standards are built to
be largely ignored companies like ORACLE actually implement their own
versions of SQL which have important features (eg. OUTER JOINS), but
they are implemented using custom syntax.  This means that while
ANSI-SQL is supposed to make most data models and queries cross
platform, in practice SQL must be rewritten for each RDBMS.<br/><br/>

Another major issue in RDBMS's is indexing.  Even if a hacker is
experienced in multi-process/thread environments and can implement a
proper ACID-compliant database, that doesn't mean they are familiar
the mathematical algorithms needed to make data retrieval fast.  Once
you have a million users who have each on average purchased 5 items it
becomes difficult to determine who has purchased 2 or more "Teddy
Ruxpins" in under a second.  The quality of a RDMS's indexing
implementation is all important if your site will ever have even a
medium amount of data.<br/><br/>

Another major issue facing someone trying to choose an RDBMS is it's
behaviour when dealing with simultaneous writes and reads.  One way to
implement transactions is to perform table wide locks.  This means
that when someone is writing to a table no one can write or read from
that table.  Sure this guarantees Isolation, but it also means that
when someone is doing a huge hour long transaction your database a
part of your database may be effectively locked down.  For a while
even major RDBMS's like Microsoft's SQL Server had this type of
locking.  Both ORACLE and PostgreSQL implement what is called
"row-level locking".  While a table's row is locked (which is only
while it is being written to) no one else can write to this row.  As
an individual row-write is fast, this means that the wait time is near
0.  These database implementations guarantee that readers don't wait
for writers, so go on, be inefficient, have lots of 1 hour
transactions, readers won't wait a second (although your redo logs may
become swollen, but that is another matter).

</blockquote>

<hr>
<a href="tristancohen@yahoo.com">tristancohen@yahoo.com</a>
</body>
</html>
by admin last modified 2003-04-28 10:15
 


View My Stats