Jargon and concepts everyone who's taken a database course must know. Plagarized
from various sources.
- a shared, integrated collection of data and metadata about
the data. Logically related data stored in a single data repository,
avoiding scattered islands of information.
a list of information.
A database, also called a database server, is the key to solving the
problems of information management. In a relational database,
collections of related information are organized into structures
A table is a two-dimensional object made up of columns and rows.
Each table contains rows (records) that are composed of
The tables contain common columns that allow data from one table to be
related to another.
Tables are the basic database objects and contain all the user
data. When creating a table, it is important that you define that data
that you want to store in the table. You need to specify the datatype
of the data and any restrictions on the range of values.
The tables are stored in the database in structures
called schemas, which are logical structures/containers of data where database
users store their objects.
databases objects include tables, indexes, views, sequences,
synonyms, functions, procedures, packages, triggers.
A schema is a collection of database objects. A schema is owned by a
database user and has the same name as that user, such as the HR
schema. Schema objects are logical structures created by
users. Objects can define areas of the database to hold data, such as
tables, or can consist of just a definition, such as views.
Every object in the database belongs to just one schema and has a
unique name within that schema. Generally, you place all of the
objects that belong to a single application in the same schema.
you need to specify a datatype for the data that is used by the
object. When you create a table, you must specify a datatype for each
of its columns. A datatype associates a fixed set of properties with
the values that can be used in a column, or in an argument of a
procedure or function. Each column value and constant in a SQL
statement has a datatype, which is associated with a specific storage
format, constraints, and a valid range of values. The most common
datatypes are character, numeric, and date.
DBMS (database management system)
- software that manages the databases. Stores data structures,
relationships between those structures, and access paths to the structures.
Well-known vendors: Oracle, IBM DB2, Microsoft SQL Server, MySQL,
- all data is visible to the user as tables; all operations work on
Codd's 12 rules are the semiofficial definition of a
relational database; ideal goal.
Vendors compete on performance, features, development tools, support,
apps that use the DBMS.
Uniquely-named tables. Each with one or more named columns of particular data
types arranged in specific left-to-right order.
Each table with zero or more rows each
containing single data value in each column. Rows are unordered.
Tables are related/linked by common data values in the primary key and
foreign key columns of a parent table and child table, respectively.
Table where every row is different from all other rows (due to primary
key) is a mathematical relation, whence the name
(not because the tables are "related" to other tables).
Table is a subset of Cartesian product of columns possible values.
Table stores information about one kind of thing.
Table name should be singular.?
Row contains information for only one item. Each item's information is
stored in just one row.
- description of the data characteristics and the relationships
that link the data. Stored in the database just as regular data is.
Information about tables, columns, key, constraints etc. Can be
accessed by front-ends.
- Spreadsheet, Database vs
millions/billions of rows
relationships/links between tables
data quality constraints e.g. length limits, non-empty, within ranges,
sensibly related to other tables' data
SQL (Structured Query Language)
relatively simple, intuitive, English-like, high-level langauge that handles most
aspects of data manipulation. Supported by all major RDBMS.
All database operations are performed using SQL statements.
database sublanguage used to control ALL of the functions that a DBMS
provides for its users:
database access, nonprocedural language. Users describe in SQL what
they want done, and the SQL language compiler automatically generates
a procedure to navigate the database and perform the desired task.
- data definition languageDDL
- data manipulation language and retrieval DML
- data access control language DCL
- sharing by concurrent users
- data integrity constraints
SQL is a interactive query, database programming, database
administration, client/server, Internet data access, distributed,
database gateway language.
Useful for doing: database admin, programming, ad-hoc querying.
spur of the moment, interactive query made possible by DBMS's
SQL command categories:
a nonprocedural programming language that enables you to access a
relational database. Using SQL statements, you can query tables to
display data, create objects, modify objects, and perform
administrative tasks. All you need to do is describe in SQL what you
want done, and the SQL language compiler automatically generates a
procedure to navigate the database and perform the desired task.
data definition: create alter drop rename truncate
data manipulation: insert update delete
data retrieval: select
data control: grant revoke
transaction control: commit rollback savepoint ("undo")
set/declarative processing: tell the database what is required, not
how each piece of data should be handled.
SQL dialects per vendor are extensions into programming
language constructs: variables, loops, error-handling, arrays, etc.
e.g. Oracle PL/SQL
SQL2003 is ANSI/ISO standard.
as a language, has identifiers, literals, operators, reserverd words / keywords.
2 categories of DBMS by use:
production/transaction/OLTP (online transaction processing)
- most-encountered DB. Many short transactions with response time important. Emphasis on data
integrity, data consisitency, operational speed.
Originally, RDBMS was considered too slow for OLTP (IBM's heirarchical
IMS dominated; still used).
data warehouse/OLAP (online analytical processing)/decision support
- designed for
data mining; large; built from production database historical
and aggregated data; for large, complex
queries at middle/upper management level to make tactical/strategic
decisions. "decision support systems".
Often: time-series data, statistical summarizing, read-only queries.
Data extraction, transformation, and validation from OLTP databases
- unnecessarily duplicated data; bad stuff, must
avoid. Leads to data inconsistency (different and conflicting
versions of the same data), i.e. lack of data integrity. aka
data anomaly (insertion/deletion/modification anomalies)
- person, place, thing, event about which data is collected and
Corresponding terms in 3 models
|Relational model (theory)
||SQL DBMS (implementation)