Clouds
Home

SQL Interview Questions 1
SQL Interview Questions 2
RUP Interview Questions
UML Interview Questions

SQL Interview Questions 1


Explain inner and outer joins ?

Joins allow database users to combine data from one table with data from one or more other tables (or views, or synonyms). Tables are joined two at a time making a new table containing all possible combinations of rows from the original two tables.
Let's take an example

Employees

ID FIRST_NAME SURNAME STATE
1001 Tapan Bhardwaj Meghalaya
1002 Kundan Upadhyaya Haryana
1003 Pawan Bhardwaj Delhi
1004 Vishal Bhardwaj Meghalaya

Executives

ID FIRST_NAME SURNAME STATE
1001 Tapan Bhardwaj Meghalaya
1002 Kundan Upadhyaya Haryana
1005 Prasanna Kaushik AUS

Inner Joins:

Chooses the join criteria using any column names that happen to match between the two tables. The example below displays only the employees who are executives as well.
SELECT emp.first_name, exec.surname FROM employees emp, executives exec WHERE emp.id = exec.id;

The output is:

FIRST_NAME SURNAME
Tapan Bhardwaj
Kundan Upadhyaya

Left Outer Join:

A problem with the inner join is that only rows that match between tables are returned. The example below will show all the employees and fill the null data for the executives.
SELECT emp.first_name, exec.surname FROM employees emp left join executives exec ON emp.id = exec.id;

On Oracle

SELECT emp.first_name, exec.surname FROM employees emp, executives exec WHERE emp.id = exec.id(+);

The output is:

FIRST_NAME SURNAME
Tapan Bhardwaj
Kundan Upadhyaya
Pawan
Vishal

Right Outer Join:

A problem with the inner join is that only rows that match between tables are returned. The example below will show all the executives and fill the null data for the employees.
SELECT emp.first_name, exec.surname FROM employees emp right join executives exec ON emp.id = exec.id;

On Oracle

SELECT emp.first_name, exec.surname FROM employees emp, executives exec WHERE emp.id(+) = exec.id;

The output is

FIRST_NAME SURNAME
Tapan Bhardwaj
Kundan Upadhyaya
Kaushik

Full Outer Join:

To cause SQL to create both sides of the join
SELECT emp.first_name, exec.surname FROM employees emp full join executives exec ON emp.id = exec.id;

On Oracle

SELECT emp.first_name, exec.surname FROM employees emp, executives exec WHERE emp.id = exec.id (+)
UNION
SELECT emp.first_name, exec.surname FROM employees emp, executives exec WHERE emp.id(+) = exec.id

Note:

Oracle9i introduced the ANSI compliant join syntax. This new join syntax uses the new keywords inner join, left outer join, right outer join, and full outer join, instead of the (+) operator

The output is:

FIRST_NAME SURNAME
Tapan Bhardwaj
Kundan Upadhyaya
Kaushik
Pawan
Vishal

Self Join:

A self-join is a join of a table to itself. If you want to find out all the employees who live in the same city as employees whose first name starts with Tapan, then one way is to use a sub-query as shown below: SELECT emp.first_name, emp.surname FROM employees emp WHERE state IN (SELECT city FROM employees where first_name like ‘Tapan’)
The sub-queries can degrade performance. So alternatively we can use a self-join to achieve the same results.

On Oracle

SELECT emp.first_name, emp.surname FROM employees emp, employees emp2 WHERE emp.city = emp2.city AND emp2.first_name LIKE 'Tapan'

The output is

FIRST_NAME SURNAME
Tapan Bhardwaj
Vishal Bhardwaj

Explain a sub-query? How does a sub-query impact on performance ?

It is possible to embed a SQL statement within another. When this is done on the WHERE or the HAVING statements, we have a subquery construct. What is subquery useful for? It is used to join tables and there are cases where the only way to correlate two tables is through a subquery.
SELECT emp.first_name, emp.surname FROM employees emp WHERE emp.id NOT IN (SELECT id FROM executives);
There are performance problems with sub-queries, which may return NULL values. The above sub-query can be re-written as shown below by invoking a correlated sub-query:
SELECT emp.firstname, emp.surname FROM employees emp WHERE emp.id NOT EXISTS (SELECT id FROM executives);
The above query can be re-written as an outer join for a faster performance as shown below:
SELECT emp.firstname, exec.surname FROM employees emp left join executives exec on emp.id = exec.id AND exec.id IS NULL;
The above execution plan will be faster by eliminating the sub-query.

What is normalization ?

Normalization is a design technique that is widely used as a guide in designing relational databases. Normalization is essentially a two step process that puts data into tabular form by removing repeating groups and then removes duplicated data from the relational tables (Additional reading recommended).
Redundant data wastes disk space and creates maintenance problems. If data that exists in more than one place must be changed, the data must be changed in exactly the same way in all locations which is time consuming and prone to errors. A change to a customer address is much easier to do if that data is stored only in the Customers table and nowhere else in the database.
Inconsistent dependency is a database design that makes certain assumptions about the location of data. For example, while it is intuitive for a user to look in the Customers table for the address of a particular customer, it may not make sense to look there for the salary of the employee who calls on that customer. The employee's salary is related to, or dependent on, the employee and thus should be moved to the Employees table. Inconsistent dependencies can make data difficult to access because the path to find the data may not be logical, or may be missing or broken.

First Normal Form

A database is said to be in First Normal Form when all entities have a unique identifier or key, and when every column in every table contains only a single value and doesn't contain a repeating group or composite field.

Second Normal Form

A database is in Second Normal Form when it is in First Normal Form plus every non-primary key column in the table must depend on the entire primary key, not just part of it, assuming that the primary key is made up of composite columns.

Third Normal Form

A database is in Third Normal Form when it is in Second Normal Form and each column that isn't part of the primary key doesn't depend on another column that isn't part of the primary key.

When to denormalize ?

Typically, transactional databases are highly normalized. This means that redundant data is eliminated and replaced with keys in a one-to-many relationship. Data that is highly normalized is constrained by the primary key/foreign key relationship, and thus has a high degree of data integrity. Denormalized data, on the other hand, creates redundancies; this means that it's possible for denormalized data to lose track of some of the relationships between atomic data items. However, since all the data for a query is (usually) stored in a single row in the table, it is much faster to retrieve.

How do you implement one-to-one, one-to-many and many-to-many relationships while designing tables ?

One-to-One relationship can be implemented as a single table and rarely as two tables with primary and foreign key relationships.
One-to-Many relationships are implemented by splitting the data into two tables with primary key and foreign key relationships.
Many-to-Many relationships are implemented using join table with the keys from both the tables forming the composite primary key of the junction table.

How can you performance tune your database ?

  • Denormalize your tables where appropriate.
  • Proper use of index columns: An index based on numeric fields is more efficient than an index based on character columns.
  • Reduce the number of columns that make up a composite key.
  • Proper partitioning of tablespaces and create a special tablespace for special data types like CLOB, BLOB etc.
  • Data access performance can be tuned by using stored procedures to crunch data in the database server to reduce the network overhead and also caching data within your application to reduce the number of accesses.
  • How will you map objects to a relational database ? How will you map class inheritance to relational data model ?

    Due to impedance mismatch between object and relational technology you need to understand the process of mapping classes (objects) and their relationships to tables and relationships between them in a database. Classes represent both behavior and data whereas relational database tables just implement data. Database schemas have keys (primary keys to uniquely identify rows and foreign keys to maintain relationships between rows) whereas object schema does not have keys and instead use references to implement relationships to other objects.
  • Classes map to tables in a way but not always directly.
  • An attribute of a class can be mapped to zero or more columns in a database. Not all attributes are persistent.
  • Some attributes of an object are objects itself. For example an Employee object has an Address object as an attribute. This is basically an association relationship between two objects.
  • In its simple form an attribute maps to a single column whereas each has same type (i.e. attribute is a string and column is a char, or both are dates etc). When you implement mapping with different types (attribute is a currency and column is a float) then you will need to be able to convert them back and forth.
  • What is a view ? Why will you use a view? What is an aggregate function ?

    View is a precompiled SQL query, which is used to select data from one or more tables. A view is like a table but it doesn’t physically take any space (i.e. not materialized).
    Views are used for
  • Providing inherent security by exposing only the data that is needed to be shown to the end user.
  • Enabling re-use of SQL statements.
  • Allows changes to the underlying tables to be hidden from clients, aiding maintenance of the database schema (i.e. encapsulation).

  • Views with multiple joins and filters can dramatically degrade performance because views contain no data and any retrieval needs to be processed. The solution for this is to use materialized views or create de-normalized tables to store data. This technique is quite handy in overnight batch processes where a large chunk of data needs to be processed. Normalized data can be read and inserted into some temporary denormalized table and processed with efficiency.

    What is a database trigger ?

    A trigger is a fragment of code that you tell to run before or after a table is modified. There are typically three triggering EVENTS that cause trigger to 'fire':
  • INSERT event (as a new record is being inserted into the database).
  • UPDATE event (as a record is being changed).
  • DELETE event (as a record is being deleted).

  • Triggers can restrict access to specific data, perform logging, or audit access to data.

    How can you keep track of all your database changes ?

    If you want to keep track of all changes to a particular record, such as who modified the record, what kind of modification took place, and when the record modification occurred then you can use triggers because you can capture every action that occurred on a particular table. For example, an INSERT trigger would fire when a particular database table has a record inserted.

    SQL Interview Questions 2 >>>




    Home Clouds