Friday, June 8, 2012
E-R Model Concepts
E-R Model Concepts
Entities
Entities are specific objects or things in the mini-world that are represented in the database
Ex :-a company, a job, or a university course.
What Should an Entity Be?
SHOULD BE:
An object that will have many instances in the database
An object that will be composed of multiple attributes
An object that we are trying to model
SHOULD NOT BE:
A user of the database system
An output of the database system (e.g., a report)
Characteristics of Bad Database
Characteristics of Bad Database
Not satisfy all user requirements
Would contain inconsistent and invalid data
Would require excessive effort to change
Would be slow and clumsy to use for achieving a desired outcome.
When trying to decide if some choices are better than others, you need to consider the key desirable properties of a database.
Classification of DBMSs
Classification of DBMSs
Based on the data model used:
Traditional: Relational, Network, Hierarchical.
Emerging: Object-oriented, Object-relational.
Other classifications:
Single-user (typically used with micro-computers)vs. multi-user (most DBMSs).
Centralized (uses a single computer with one database)vs. distributed (uses multiple computers, multiple databases)
Classification of DBMSs
Distributed Database Systems have now come to be known as client server based database systems because they do not support a totally distributed environment, but rather a set of database
Two Tier Client-Server Architecture-Three Tier Client-Server Architecture
Two Tier Client-Server Architecture
User Interface Programs and Application Programs run on the client side
Interface called ODBC (Open Database Connectivity ) provides an Application program interface (API) allow client side programs to call the DBMS. Most DBMS vendors provide ODBC drivers.
A client program may connect to several DBMSs.
Other variations of clients are possible: e.g., in some DBMSs, more functionality is transferred to clients including data dictionary functions, optimization and recovery across multiple servers, etc.
In such situations the server may be called the Data Server.
Three Tier Client-Server Architecture
Common for Web applications
Intermediate Layer called Application Server or Web Server:
stores the web connectivity software and the rules and business logic (constraints) part of the application used to access the right amount of data from the database server
acts like a conduit for sending partially processed data between the database server and the client.
Additional Features- Security:
encrypt the data at the server before transmission
decrypt data at the client
Other Tools
Other Tools
Data dictionary / repository
Used to store schema descriptions and other information such as design decisions, application program descriptions, user information, usage standards, etc.
Active data dictionary is accessed by DBMS software and users/DBA.
Passive data dictionary is accessed by users/DBA only.
Application Development Environments and CASE (computer-aided software engineering) tools:
Examples –Power builder (Sybase), Builder (Borland)
communications software
function is to allow users at locations remote from the database system site to access the database through computer terminals, workstations, or their local personal computers.
Database System Utilities
Database System Utilities
To perform certain functions such as:
Loading data stored in files into a database. Includes data conversion tools.
Backing up the database periodically on tape.
Reorganizing database file structures.
Report generation utilities.
Performance monitoring utilities.
Other functions, such as sorting, user monitoring, data compression, etc.
DBMS Interfaces
DBMS Interfaces
Menu-Based Interfaces for Browsing :- These interfaces present the user with lists of options, called menus, that lead the user through the formulation of a request .
Forms-Based Interfaces :- A forms-based interface displays a form to each user. Users can fill out all of the form entries to insert new data, or they fill out only certain entries, in which case the DBMS will retrieve matching data for the remaining entries.
Graphical User Interfaces :- A graphical interface (GUI) typically displays a schema to the user in diagrammatic form. The user can then specify a query by manipulating the diagram. In many cases, GUIs utilize both menus and forms.
Natural Language Interfaces :- These interfaces accept requests written in English or some other language and attempt to "understand" them.
Interfaces for Parametric Users :- Parametric users, such as bank tellers, often have a small set of operations that they must perform repeatedly.
Data Independence
Data Independence
Capacity to change the schema at one level of a database system without having to change the schema at the next higher level.
Logical Data Independence: The capacity to change the conceptual schema without having to change the external schemas and their application programs.
Physical Data Independence: The capacity to change the internal schema without having to change the conceptual schema.
Data independence is accomplished because, when the schema is changed at some level, the schema at the next higher level remains unchanged; only the mapping between the two levels is changed.
However, the two levels of mappings create an overhead during compilation or execution of a query or program, leading to inefficiencies in the DBMS.
DBMS Languages
DBMS Languages
Data Definition Language(DDL): Used by the DBA and database designers to specify the conceptual schema of a database.
In many DBMSs, the DDL is also used to define internal and external schemas (views).
In some DBMSs, separate storage definition language(SDL) and view definition language(VDL) are used to define internal and external schemas.
Data Manipulation Language(DML): Used to specify database retrievals and updates.
DML commands (data sublanguage) can be embedded in a general-purpose programming language (host language), such as COBOL, C or an Assembly Language.
Alternatively, stand-alone DML commands can be applied directly (query language).
High Level or Non-procedural Languages : e.g., SQL, are set-oriented and specify what data to retrieve than how to retrieve. Also called declarative languages.
Low Level or Procedural Languages: record-at-a-time ;they specify how to retrieve data and include constructs such as looping.
Schema VS Instances
Schema VS Instances
Database Schema: The description of a database. Includes descriptions of the database structure and the constraints that should hold on the database.
Schema Diagram: A diagrammatic display of (some aspects of) a database schema.
Schema Construct: A component of the schema or an object within the schema, e.g., STUDENT, COURSE.
Database Instance: The actual data stored in a database at a particular moment in time. Also called database state(or occurrence).
Database Schema Vs. Database State
Database State: Refers to the content of a database at a moment in time.
Initial Database State: Refers to the database when it is loaded
Valid State: A state that satisfies the structure and constraints of the database.
Distinction
The database schema changes very infrequently. The database state changes every time the database is updated.
Schema is also called intension, whereas state is called extension
Three-Schema Architecture
Three-Schema Architecture
Proposed to support DBMS characteristics of:
Program-data independence.
Support of multiple views of the data.
Use of a catalog to store the database description (schema).
Internal schema at the internal level to describe physical storage structures and access paths. Typically uses a physical data model.
Conceptual schema at the conceptual level to describe the structure and constraints for the whole database for a community of users. Uses a conceptual or an implementation data model.
External schemas at the external level to describe the various user views. Usually uses the same data model as the conceptual level.
Network Model
Network Model
Some data were more naturally modeled with more than one parent per child.
Permitted the modeling of many-to-many relationships in data.
Consists of an owner record type, a set name, and a member record type.
A member record type can have that role in more than one set
An owner record type can also be a member or owner in another set.
Network Model
ADVANTAGES:
Network Model is able to model complex relationships and represents semantics of add/delete on the relationships.
Can handle most situations for modeling using record types and relationship types.
Language is navigational; uses constructs like FIND, FIND member, FIND owner, FIND NEXT within set, GET etc. Programmers can do optimal navigation through the database
DISADVANTAGES:
Navigational and procedural nature of processing
Database contains a complex array of pointers that thread through a set of records.
Little scope for automated "query optimization”
Categories of data models
Categories of data models
Conceptual (high-level, semantic) data models: Provide concepts that are close to the way many users perceive data. (Also called entity-based or object-based data models.)
Physical (low-level, internal) data models: Provide concepts that describe details of how data is stored in the computer.
Implementation (representational) data models: Provide concepts that fall between the above two, balancing user views with some computer storage details
Hierarchical Model
Hierarchical Model
Organizes data in a tree structure.
This structure implies that a record can have repeating information, generally in the child data segments.
Collects all the instances of a specific record together as a record type.
To create links between these record types, the hierarchical model uses Parent Child Relationships. These are a 1:N mapping between record types.
Hierarchical Model
ADVANTAGES:
Hierarchical Model is simple to construct and operate on
Corresponds to a number of natural hierarchically organized domains -e.g., assemblies in manufacturing, personnel organization in companies
Language is simple; uses constructs like GET, GET UNIQUE, GET NEXT, GET NEXT WITHIN PARENT etc.
DISADVANTAGES:
Navigational and procedural nature of processing
Database is visualized as a linear arrangement of records
Little scope for "query optimization"
Data Models
Data Models
Data Model: A set of concepts to describe the structure of a database, and certain constraints that the database should obey.
Data Model Operations: Operations for specifying database retrievals and updates by referring to the
concepts of the data model. Operations on the data model may include basic operations and user-defined operations.
Example of User Defined Operation
Calculate GPA
Sum(Result GPA * No. of Credits)/Sum(Credits)
Typical DBMS Functionality
Typical DBMS Functionality
Define a database : in terms of data types, structures and constraints
Construct or Load the Database on a secondary storage medium
Manipulating the database : querying, generating reports, insertions, deletions and modifications to its content
Concurrent Processing and Sharing by a set of users and programs – yet, keeping all data valid and consistent
Protection or Security measures to prevent unauthorized access
Presentation and Visualization of data
Extending Database Capabilities
Extending Database Capabilities
New functionality is being added to DBMSs in the following areas:
Scientific Applications
Image Storage and Management
Audio and Video data management
Data Mining
Spatial data management
Time Series and Historical Data Management
The above gives rise to new research and development in incorporating new data types, complex data structures, new operations and storage and indexing schemes in database systems.
When not to use a DBMS
When not to use a DBMS
Main inhibitors (costs) of using a DBMS:
High initial investment and possible need for additional hardware.
Overhead for providing generality, security, concurrency control, recovery, and integrity functions.
When a DBMS may be unnecessary:
If the database and applications are simple, well defined, and not expected to change.
If there are stringent real-time requirements that may not be met because of DBMS overhead.
If access to data by multiple users is not required.
When no DBMS may suffice:
If the database system is not able to handle the complexity of data because of modeling limitations
If the database users need special operations not supported by the DBMS.
Historical Development of Database Technology
Historical Development of Database Technology
Early Database Applications
The Hierarchical and Network Models were introduced in mid 1960’s and dominated during the seventies. A bulk of the worldwide database processing still occurs using these models.
Relational Model based Systems
The model that was originally introduced in 1970 was heavily researched and experimented with in IBM and the universities. Relational DBMS Products emerged in the 1980’s.
Object-oriented applications
OODBMSs were introduced in late 1980’s and early 1990’s to cater to the need of complex data processing in CAD and other applications. Their use has not taken off much.
Data on the Web and E-commerce Applications
Web contains data in HTML (Hypertext markup language) with links among pages. This has given rise to a new set of applications and E-commerce is using new standards like XML (eXtended Markup Language).
Advantages of Using a DBMS
Advantages of Using a DBMS
Controlling Redundancy
Restricting Unauthorized Access
Providing Persistent Storage for Program Objects and Data Structures
Permitting Inferencing and Actions Using Rules
Providing Multiple User Interfaces
Representing Complex Relationships Among Data
Enforcing Integrity Constraints
Providing Backup and Recovery
Implications of the Database Approach
Implications of the Database Approach
Potential for Enforcing Standards
This is very crucial for the success of database applications in large organizations Standards refer to data item names, display formats, screens, report structures, meta-data (description of data) etc
Reduced Application Development Time
Incremental time to add each new application is reduced
Flexibility to change data-structures
Database structure may evolve as new requirements are defined
Availability of Up-to-Date Information
Very important for on-line transaction systems such as airline, hotel, car reservations
Economies of Scale
By consolidating data and applications across departments wasteful overlap of resources and personnel can be avoided
System Analysts & Application Programmers (Software Engineers)
System Analysts & Application Programmers (Software Engineers)
System analysts
determine the requirements of end users, especially naive and parametric end users, and develop specifications for canned transactions that meet these requirements.
Application programmers
implement these specifications as programs; then they test, debug, document, and maintain these canned transactions.
Such analysts and programmers (nowadays called software engineers) should be familiar with the full range of capabilities provided by the DBMS to accomplish their tasks
Categories of End-users
Categories of End-users
People whose jobs require access to the database for querying, updating, and generating reports
Casual
Access database occasionally when needed
Naïve or Parametric
They make up a large section of the end-user population.
They use previously well-defined functions in the form of “canned transactions” against the database.
Examples are bank-tellers or reservation clerks who do this activity for an entire shift of operations.
Sophisticated
These include business analysts, scientists, engineers, others thoroughly familiar with the system capabilities.
Many use tools in the form of software packages that work closely with the stored database.
Stand-alone
Mostly maintain personal databases using ready-to-use packaged applications.
An example is a tax program user that creates his or her own internal database.
Database Users
Database Users
Users may be divided into those who actually use and control the content (called “Actors on the Scene”) and those who enable the database to be developed and the DBMS software to be designed and implemented (called “Workers Behind the Scene”).
Actors on the Scene
Database Administrators
Administrating the database & the DBMS
Responsibilities :-
Authorizing access to the database
Coordinating and monitoring the use of database
Acquiring software and hardware resources as needed
Accountable for problems such as breach of security or poor system response time
Database Designers
Responsible for identifying the data to be stored in the database and for choosing appropriate structures to represent and store this data
Responsibilities :-
Communicate with all prospective database users, in order to understand their requirement
Identifying the data to be stored in the database
Choosing appropriate structures to represent and store this data
Develop a view of the database that meets the data and processing requirements
Characteristics of the Database Approach
Characteristics of the Database Approach
Self-Describing Nature of a Database System
Insulation between Programs and Data, and Data Abstraction
Support of Multiple Views of the Data
Sharing of Data and Multiuser Transaction Processing
Self-Describing Nature of a Database System
A complete definition or description of the database structure and constraints
DBMS software works equally well with any number of database applications
DBMS catalog stores the description of the database. The description is called meta-data
Insulation between Programs and Data, and Data Abstraction
Program-data independence
Allows changing data storage structures and operations without having to change the DBMS access
programs
Program - operation independence :-
The interface (or signature) of an operation includes the operation name and the data types of its
arguments (or parameters).
The implementation (or method) of the operation is specified separately and can be changed without affecting the interface.
Data Abstraction
A data model is used to hide storage details and present the users with a conceptual view of the database
Support of Multiple Views of the Data
A database typically has many users, each of whom may require a different perspective or view of the database.
A view may be a subset of the database or it may contain virtual data that is derived from the database files but is not explicitly stored
A multiuser DBMS whose users have a variety of applications must provide facilities for defining multiple views
Sharing of Data and Multiuser Transaction Processing
Allow multiple users to access the database at the same time
Concurrency control
Allowing a set of concurrent users to retrieve and to update the database.
Concurrency control within the DBMS guarantees that each transaction is correctly executed or completely aborted.
OLTP (Online Transaction Processing) is a major part of database applications.
What is Database Management System?
What is Database Management System?
- A database management system (DBMS) is a collection of programs that enables users to create and maintain a database.
- The DBMS is hence a general-purpose software system that facilitates the processes of defining, constructing, and manipulating databases for various applications.
What is Database Management System?
- Database System: The DBMS software together with the data itself. Sometimes, the applications are also included.
Example of a Database (with a Conceptual Data Model)
Mini-world for the example: Part of a UNIVERSITY environment.
Some mini-world entities:
STUDENTs
COURSEs
SECTIONs (of COURSEs)
(academic) DEPARTMENTs
INSTRUCTORs
Example of a Database (with a Conceptual Data Model)
Some mini-world relationships:
SECTIONs are of specific COURSEs
STUDENTs take SECTIONs
COURSEs have prerequisite COURSEs
INSTRUCTORs teach SECTIONs
COURSEs are offered by DEPARTMENTs
STUDENTs major in DEPARTMENTs
Note: The above could be expressed in the ENTITY-RELATIONSHIP data model.
What is a Database?
What is a Database?
- A database is a collection of related data
- A database is designed, built, and populated with data for a specific purpose. It has an intended group of users and some preconceived applications in which these users are interested.
Data & Mini World
- Data: Known facts that can be recorded and have an implicit meaning.
- Mini-world: Some part of the real world about which data is stored in a database. For example, student grades and transcripts at a university
Types of Databases and Database Applications
- Numeric and Textual Databases
- Multimedia Databases
- Geographic Information Systems (GIS)
- Data Warehouses
- Real-time and Active Databases
Subscribe to:
Posts (Atom)