Unit 33 Data Analysis and Design Sample Assignment

Task 1:

Need help?

1.1-Critically compare different data models and schemas.

Task 2:

Task 3:

Task 4:

1.2- Compare database systems with file based systems and also discuss the benefits and limitations of different database technologies with reference to above scenario

1.3- Analyze different approaches to database design.

2.1 - Discuss the principles of normalization and steps followed to achieve normal forms

2.2 – ER Diagram for English Premier League with all entity sets, attributes and relationships and cardinality constraints

2.3 –Relational Data Model

2.4 – Create tables using SQL (DDL) commands

2.5 – Screenshots from SQL Server Management Studio (IDE)

3.1 - Benefits of using manipulation and query tools in a relational database system

3.2 – Populate tables with some data and answer queries –

3.3 - Critically evaluate the validity of the data extracted using the above queries and comment on the design process followed to ensure that meaningful data is extracted through the use of query tools.

4.1 - Critically review and test the relational database system designed for the given scenario and thus provide a documentation supporting implementation and testing of the relational database system developed.

4.2 - Create brief user documentation for the relational database system developed.

4.3 - By giving a tabular V&V document, explain how verification and validation has been addressed.

4.4 - Explain control mechanisms and show how these techniques have been used in developing your system.

References:

Data Models tells us how to create logical design of database. There are three major types of data models:

Relational Model – In this model data is organized in form of tables or relations which are related to each other.
Consider an example, where there is a company with various departments. Each department has certain manager and employees. Every entity is related to each other.

Hierarchical Model - In this model, each entity has one parent or root but may have one or several children or child entities.
Network Model – In this model, all entities are organized in form of graph. Some entities can be accessed from more than one path.

File systems and database management systems (DBMS) are two ways of managing data. DBMS is a computer based system to record persistent data.

In database management system, data is stored in large databases, while in file system, data is stored in form of files.
In file system, tasks like storage, retrieval, and search are done manually while in DBMS such tasks are automated, usually done by help of some tool.
Manual process of executing tasks invites problems like data integrity, data inconsistency and data security while such problems can be avoided in DBMS.
In DBMS certain control mechanisms are used and reading line by line is not required, whereas this is not the case with file systems.
We can avoid unauthorized access to database in DBMS whereas in file systems this is not possible.
With DBMS we can make use of backup and recovery, whereas in file systems, data once lost cannot be recovered back.
In DBMS, multiple users can access the data at one time, whereas with file systems this is not possible. File systems are basically designed for single user to access it at a time.
DBMS avoids data duplication. If a set of data is required by multiple web application, it is made available, whereas in file system this is not possible, one program is not readable by another program.
DBMS coordinates both logical and physical access to data, while file system only provides physical access to data.

Organizations these day’s follow four major techniques of designing databases. These are:

Top-Down Approach – This starts with a general idea of what is required for the system and moves towards more specific details. This process involves identifying various entities and their attributes.
Bottom-Up Approach–This approach is just reverse of top down approach. It starts from more specific details and moves down to general details. In this approach, attributes are identified prior to entities.
Centralized Approach–This approach is useful in small organizations, where there are less number of entities and interactions between entities and attributes is easy. This approach is even considered in some large organizations which follow a relatively simple database structure. Normally it has a single database administrator or a small website design team that designs the entire database for the organization. Problem or specification is defined by this person or design team, a conceptual design is created, this conceptual design is verified by user views and system processes and data constraints are defined, to ensure that developed design complies with organization’s requirements.
Decentralized Approach–This approach is useful when the system has large number of entities and complex relationships exist among them. A team of carefully selected designers is employed for large or complex projects, to accomplish the job.

Database normalization is a technique to arrange, manage or organize the data (fields and tables) in a relational database. This process is followed to minimize redundancy and to ensure that data is logically stored.
If our database is not normalized, it becomes difficult to maintain and update database without data loss. We might face three kinds of anomalies if database is not normalized. These are:
Insertion Anomaly:
Consider table Manager:

Manager_id	Manager_name	Manager_address	Department
01	Neil	Gurgaon	Marketing
02	Edwin	Mumbai	Production
03	Stuart	Delhi	Production
04	Neil	Gurgaon	Sales

Suppose for table Manager, for a new entry we have manager_id, manager_name, manager_address, but the newly recruited person has not been assigned any department yet, still then we have to insert NULL there, leading to insertion anomaly.
Deletion Anomaly:
In above table, if for ‘01’ manager_id there is only one department assigned, but if he temporarily drops it, then we might have to delete entire record corresponding to it, rather than only one column.
Updation Anomaly:
To update the address of the manager, who occurs more than once in a table, we need to update address in all entries, else data will become inconsistent. E.g.: In above table, Manager_name = Neil occurs more than once, so manager_address entries needs to be changed two times in the table.
Therefore, to remove such anomalies, we need to normalize our database tables.
There are 4 major types of Normal Forms:
First Normal Form (1NF):According to this rule, no two groups of data or records in table must contain same information. E.g.:
Manager Table: (Before normalization)

Manager_name	Manager_age	Manager_department
Neil	45	Marketing, Sales
Edwin	42	Production
Stuart	49	Textile

Manager Table: (After Normalization)

Manager_name	Manager_age	Manager_department
Neil	45	Marketing
Neil	45	Sales
Edwin	42	Production
Stuart	49	Textile

After normalizing manager table, each row has unique data, but data redundancy increases.
Second Normal Form (2NF): There must not be any partial dependencies, according to this rule. The concatenation must depend on primary key.
Manager table after 2NF:

Manager name	Manager age
Neil	45
Edwin	42
Stuart	49

Manager_name	Manager_department
Neil	Marketing
Neil	Sales
Edwin	Production
Stuart	Textile

Third Normal Form (3NF): According to this rule, all the transitive dependencies must be removed from the table and every non-prime attribute must be dependent on primary key.

Manager_id	Manager_name	Manager_age	Manager_street	Manger_city	Manager_zip

New tables after 3NF:

Boycc Codd Normal Form (BCNF):A table is said to be in BCNF, if it is in 3NF and if it does not have multiple overlapping candidate keys.

ER Diagram for English Premier League

Table	Attributes	Data Type	Constraints
Manager	Manager_id	int	Primary Key
	Manager_fname	Varchar(50)
	Manager_lname	Varchar(50)
	Manager_dob	datetime
	Manager_pob	Varchar(50)

Players	Player_id	int	Primary key
	Player_fname	Varchar(50)
	Player_lname	Varchar(50)
	Player_dob	datetime
	Player_pob	Varchar(50)
	Game_id	int	Foreign Key
	score	int

Coach	Coach_id	int	Primary key
	Coach_fname	Varchar(50)
	Coach_lname	Varchar(50)

Refree	Refree_id	int	Primary Key
	Refree_fname	Varchar(50)
	Refree_lname	Varchar(50)

Team	Team_id	int	Primary Key
	Team_name	Varchar(50)
	Score	int
	city	Varchar(50)
	Manager_id	int	Foreign key
	Coach_id	int	Foreign Key
	Player_id	int	Foreign Key
	Game_id	int	Foreign Key

Table: Manager

createtable manager
CODING A

CODING A1

Creation of Table -Manager:

Creation of Table -Players:

Creation of table – Coach:

Creation of Table – refree:

Creation of Table – Team:

Creation of Table – Game:

Update table with Primary Key Constraint:

Insertion of Data into Table:

Updation of Data into table:

Get Complete Solution From Best Locus Assignment Experts.

Place an order

SQL DDL commands – DDL stands for data definition language. These commands help us to define the database structure.
Create Table statement, Drop table statement and Alter table statement are three major commands which help us to define structure of database and schema.
Create Table statement: This statement helps us to create table in database. E.g.: CODING A2

Alter table statement: This statement helps us change a value in a database table. E.g.:
altertable manager1
addprimarykey(manager_id);
Drop table statement: This statement helps us to delete a database table along with its schema. E.g.:
droptable manager1;
SQL DML commands – DML stands for data manipulation language. These help us to manage data with schema objects.
Insert statement, update statement and delete statement are three major commands which help us to manipulate with the date in the database tables.
Insert Statement: It helps us to insert data or records in database table.
E.g.–To insert a record of data into table manager we write following query in SQL Server Management Studio IDE:
Insertinto manager
(manager_id,manager_fname,manager_lname,manager_dob,manager_pob)
values(1,'Neil','Armstrong','february 27,1984','london')
Delete Statement: It helps us to delete data from database table.
E.g. – To delete a record of data from table manager, we write following query in SQL Server Management Studio IDE:
deletefrom manager
where manager_id = 2;
Update Statement : This statement helps us to modify data from database table.
Eg – To update a record of data from table manager, we have to write following query in SQL Server Management Studio IDE:
update manager
set manager_fname ='Edwin'where manager_id ='1';
By above query, the record with manager_id = 1 will update, manager_fname to Edwin.

List the managers of each team.
Manager_fname, Manager_lname from manager
team
On manager.manager_id = team.manager_id;
Output the full name of the top scorer in the league with the number of goals scored.
Selecttop 1 players. player_fname, players.player_lname, players.score from players
            Innerjoin team
            On players.game_id = team.game_id
            Orderby team.score desc
Output the average number of goals per game of all games played.
AVG(score)as'Average Score'from game
List all of the games that were played in a particular ‘city’.
Select game_id from Game where city = ‘Manchester’;
List all players who have played a game that was refereed by a particular referee (choose any referee).
Select players.player_fname, players.player_lname from players, game, team, refree
Where game.refree_id = team.refree_id and
players.player_id = team.player_id and
team.refree_id = refree.refree_id and
refree_fname ='smith';
Create a database view for your system to produce a list of fixtures yet to be played along with the date of each game.
Createview fixture as
Select game_id,date
From game
Where(date >='')

All the queries mentioned above have been executed using SQL Server Management Studio IDE, and are thus successful in retrieving relevant information. By making use of SQL DDL (Data definition language) and DML (Data Manipulation Language), meaningful data can be fetch in IDE.

There are six entities in this given scenario based upon which I have designed the relational database for the system. Manager, Players, Coaches, Refree, Team, Game. I have developed the relational database for this system by defining tables with primary key and foreign key constraints. Each entity and relationship has been converted into table, and attributes into corresponding fields.
The database thus developed has important data which must be tested from time to time for any discrepancy. In day to day basis, databases are accessed by thousands of users in an organization. These databases even have important functionality like stored procedures, stored functions, triggers, views, instances, queries etc. There must be a complete regression test suite, to be run over the database periodically. There are number of reasons why we must test our database on regular basis:

Database is an important asset, by which hundreds or thousands of users are associated. If there is any loss of data, people associated with it will be affected and they may have to incur huge loss.
Databases of large organizations incorporate mission critical data in them, which contains their business functionality, so it is essential to secure it from unidentified access and any unforeseen disasters.
Testing tells us if there are any defects in the system, and whether we must take any steps to remove those errors.
If we make changes or modifications in our database, then sometimes it may result in some errors like overlapping, redundancy or missing data. Regression testing helps us detect those errors.

According to given scenario, English premier league has managers, players, and coaches, refree. Teams participate in English premier league and play games. Team can be either home team or away team. The players who play the game can be goalkeeper, midfielder, defender, and striker.
The entire database developed for this scenario is normalized up to 3nf (3^rd normal form) and thus there is no transitive dependency or redundancy.
In this project, I have created 6 database tables: Manager, Players, Coach, Refree, Team, and Game.
Description of tables is as follows:

Manager : Manager_id (pk), manager_fname, manager_lname, manager_dob, manager_pob
Players : Player_id(pk), player_fname, player_lname, player_dob, player_pob, score, game_id(fk)
Coach : Coach_id(pk), coach_fname, coach_lname
Refree : Refree_id(pk), refree_fname, refree_lname
Team : Team_id(pk), team_name, score, city, manager_id(fk), coach_id(fk), player_id(fk), score, refree_id(fk)
Game: Game_id(pk), team_id(fk), refree_id(fk), date, score

Where pk represents primary key and fk represents foreign key.

We have V-Model in software development, popularly known as verification and validation model. It is executed in sequential manner and each phase must be completed before the next phase begins. These tasks are performed to identify how consumers or users perceives the final product or software. Is the final product satisfy the quality standards and is it good to be used.

Verification	Validation

It is the process in software development life cycle to check if we are in the right track of developing the product or software.	It is the process in software development life cycle to check whether the final product meets business needs.

This process involves checking the intermediary documents like ER Diagrams, database design, website design, test cases, test plan, design documents, traceability matrix.	This process involves checking that all the design documents and requirement specification reflect the requirement and are fit to be used. When placed in intended environment, it is to be checked that product fulfills its intended use.

Verification is done before software execution	Validation involves software execution

This process involves all techniques of static testing.	This process involves all techniques of dynamic testing.

Examples of verification are : Reviews, inspection, walkthrough	Examples of validation are : All types of testing like Regression testing, System testing, User Acceptance testing, smoke testing etc.

In this project, all design documents like ER Diagrams and requirement specifications have been verified and all system requirements have been addressed therein.	After system completion, it will undergo various types of testing to check if it is fit to be used.

Control mechanisms are methods or process to define and manage variables in a desirable way. For example a test manager at deployment site might install a variety of control mechanisms to help them monitor various testing activities in software testing life cycle.
We need to apply control mechanisms over our system to ensure that a high standard of quality is met. To keep the entire process on track, it is important to have quality control. It helps us to figure out the problem, fix it and also helps us in judging the effectiveness of implemented solution.
The entire process of quality control must go on smoothly. A control feature must be devised, to manage major portion of work in standardized manner.
If in the entire process of controlling the system work flow, we are not able to achieve any task by straight method, we need to devise some alternative control mechanisms, which comply with the standardized methods.
In the process of controlling, one must also know how to react in case of defects. The areas of system where defects are most likely to occur must be monitored carefully, so that the defects can be caught and fixed there and then, rather than continuing the work flow further. Defects must be caught and treated in their nascent stage or must be reduced to near zero, so that we can say that six sigma is attained.

Manager_id

Manager_name

Manager_age

Manager_zip

Manager_street

Manger_city

Manager_zip

ER Diagram in DBMS [Online]. [Accessed on 17 November 2014]. Available on world wide web: <http://www.studytonight.com/dbms/er-diagram.php>
Structured Query Language/Data Manipulation Language [Online]. [Accessed on 18 November 2014]. Available on world wide web <http://en.wikibooks.org/wiki/Structured_Query_Language/Data_Manipulation_Language>
Database Testing [Online]. [Accessed on 18 November 2014]. Available on world wide web <http://www.agiledata.org/essays/databaseTesting.html>
Why to test a database [Online]. [Accessed on 18 November 2014]. Available on world wide web <http://www.softwaretestinggenius.com/why-should-we-test-a-relational-database>
What is verification and validation [Online]. [Accessed on 18 November 2014]. Available on world wide web <http://www.softwaretestinghelp.com/what-is-verification-and-validation/>
Six Sigma control phase [Online]. [Accessed on 18 November 2014]. Available on world wide web <http://www.managementstudyguide.com/six-sigma-control-phase.htm>
Normalization of Database [Online]. [Accessed on 18 November 2014]. Available on world wide web <http://www.studytonight.com/dbms/database-normalization.php>
Database design strategies [Online]. [Accessed on 18 November 2014]. Available on world wide web <http://databasemanagement.wikia.com/wiki/Database_Design_Strategies>
Difference between DBMS and File System [Online]. [Accessed on 18 November 2014]. Available on world wide web <http://www.differencebetween.com/difference-between-dbms-and-vs-file-system/>
Difference between file processing system and database management system [Online]. [Accessed on 18 November 2014]. Available on world wide web <http://studychacha.com/discuss/135910-difference-between-file-processing-system-database-management-system.html>