Q1.Optimistic vs. Pessimistic locking
Optimistic Locking is a strategy where you read a record, take note of a version number (other methods to do this involve dates, timestamps or checksums/hashes) and check that the version hasn’t changed before you write the record back. When you write the record back you filter the update on the version to make sure it’s atomic. (i.e. hasn’t been updated between when you check the version and write the record to the disk) and update the version in one hit.If the record is dirty (i.e. different version to yours) you abort the transaction and the user can re-start it.
This strategy is most applicable to high-volume systems and three-tier architectures where you do not necessarily maintain a connection to the database for your session. In this situation the client cannot actually maintain database locks as the connections are taken from a pool and you may not be using the same connection from one access to the next.Optimistic locking doesn’t necessarily use a version number. Other strategies include using (a) a timestamp or (b) the entire state of the row itself. The latter strategy is ugly but avoids the need for a dedicated version column, in cases where you aren’t able to modify the schema.
Pessimistic Locking is when you lock the record for your exclusive use until you have finished with it. It has much better integrity than optimistic locking but requires you to be careful with your application design to avoid Deadlocks. To use pessimistic locking you need either a direct connection to the database (as would typically be the case in a two tier client server application) or an externally available transaction ID that can be used independently of the connection.
In the latter case you open the transaction with the TxID and then reconnect using that ID. The DBMS maintains the locks and allows you to pick the session back up through the TxID.
Optimistic locking is used when you don’t expect many collisions. It costs less to do a normal operation but if the collision DOES occur you would pay a higher price to resolve it as the transaction is aborted.Pessimistic locking is used when a collision is anticipated. The transactions which would violate synchronization are simply blocked.
To select proper locking mechanism you have to estimate the amount of reads and writes and plan accordingly
Optimistic needs a three-tier architectures where you do not necessarily maintain a connection to the database for your session whereas Pessimistic Locking is when you lock the record for your exclusive use until you have finished with it. It has much better integrity than optimistic locking you need either a direct connection to the database.optimistic (versioning) is faster because of no locking but (pessimistic) locking performs better when contention is high and it is better to prevent the work rather than discard it and start over.Optimistic locking works best when you have rare collisions
Q2.What is the Need for Indexing in Database Tables?
An index can be used to efficiently find all rows matching some column in your query and then walk through only that subset of the table to find exact matches. If you don’t have indexes on any column in the WHERE clause, the SQL server have to walk through the whole table and check every row to see if it matches, which may be a slow operation on big tables.The index can also be a UNIQUE index, which means that you cannot have duplicate values in that column, or a PRIMARY KEY which in some storage engines defines where in the database file the value is stored.
Q3.Clustered and Non Clustered Index
A Clustered index determines the physical order of data in a table.There can be only one clustered index per table (the clustered index IS the table). All other indexes on a table are termed non-clustered.A clustered index means you are telling the database to store close values actually close to one another on the disk. This has the benefit of rapid scan / retrieval of records falling into some range of clustered index values.
For example, you have two tables, Customer and Order:
Customer ---------- ID Name Address Order ---------- ID CustomerID Price
If you wish to quickly retrieve all orders of one particular customer, you may wish to create a clustered index on the “CustomerID” column of the Order table. This way the records with the same CustomerID will be physically stored close to each other on disk (clustered) which speeds up their retrieval.The index on CustomerID will obviously be not unique, so you either need to add a second field to “uniquify” the index or let the database handle that for you but that’s another story.
Since the clustered index is actually related to how the data is stored, there is only one of them possible per table (although you can cheat to simulate multiple clustered indexes).
A non-clustered index is different in that you can have many of them and they then point at the data in the clustered index. You could have e.g. a non-clustered index at the back of a phone book which is keyed on (town, address)
You can have only one clustered index per table because this defines how the data is physically arranged. If you wish an analogy, imagine a big room with many tables in it. You can either put these tables to form several rows or pull them all together to form a big conference table, but not both ways at the same time. A table can have other indexes, they will then point to the entries in the clustered index which in its turn will finally say where to find the actual data.
Clustered Index
- Only one clustered index can be there in a table
- Sort the records and store them physically according to the order
- Data retrieval is faster than non-clustered indexes
- Do not need extra space to store logical structure
Non Clustered Index
- There can be any number of non-clustered indexes in a table
- Do not affect the physical order. Create a logical order for data rows and use pointers to physical data files
- Data insertion/update is faster than clustered index
- Use extra space to store logical structure
Q4.What is Staging/Factory table?
Staging table is a temporary table that is used to stage the data for temporary purpose just before loading it to the Target table from the Source Table. As the data resides temporarily, you can do various stuff on that data like
De-duping,Cleansing, Normalizing to multiple tables, De-Normalizing from multiple to a single table and Extrapolating
Q5.Staging vs Temp table?
Staging tables are permanent table just database tables containing your business data in some form or other. Staging is the process of preparing your business data, usually taken from some business application.Temporary tables can be created at runtime and can do the all kinds of operations that one normal table can do. But, based on the table types, the scope is limited. These tables are created inside tempdb database.When we are doing large number of row manipulation in stored procedures. This is useful to replace the cursor. We can store the result set data into a temp table, then we can manipulate the data from there. When we are having a complex join operation.
Permanent table is faster if the table structure is to be 100% the same since there’s no overhead for allocating space and building the table.
Temp table is faster in certain cases (e.g. when you don’t need indexes that are present on permanent table which would slow down inserts/updates)
Q6.What are different types of Tables?
Normal tables are exactly that, physical tables defined in your database.
Local temporary tables are temporary tables that are available only to the session that created them. These tables are automatically destroyed at the termination of the procedure or session that created them.
Global temporary tables are temporary tables that are available to all sessions and all users. They are dropped automatically when the last session using the temporary table has completed. Both local temporary tables and global temporary tables are physical tables created within the tempdb database.
Table variables are stored within memory but are laid out like a table. Table variables are partially stored on disk and partially stored in memory. It’s a common misconception that table variables are stored only in memory. Because they are partially stored in memory, the access time for a table variable can be faster than the time it takes to access a temporary table.
Q7. Procedures vs Functions?
Stored Procedures (SP) | Functions (UDF – User Defined Function) |
---|---|
SP can return zero , single or multiple values. | Function must return a single value (which may be a scalar or a table). |
We can use transaction in SP | We can’t use transaction in UDF. |
SP can have input/output parameter. | Only input parameter |
We can call function from SP. | We can’t call SP from function. |
We can’t use SP in SELECT/ WHERE/ HAVING statement. | We can use UDF in SELECT/ WHERE/HAVING statement. |
We can use exception handling using Try-Catch block in SP. | We can’t use Try-Catch block in UDF. |
Q8.Table vs View?
Table is a preliminary storage for storing data and information in RDBMS. A table is a collection of related data entries and it consists of columns and rows.
A view is a virtual table whose contents are defined by a query. Unless indexed, a view does not exist as a stored set of data values in a database. The advantage of a view is that it can join data from several tables thus creating a new view of it
Advantages over table are
- We can combine columns/rows from multiple table or another view and have a consolidated view.
- Views can be used as security mechanisms by letting users access data through the view, without granting the users permissions to directly access the underlying base tables of the view
- It acts as abstract layer to downstream systems, so any change in schema is not exposed and hence the downstream systems doesn’t get affected.
- Instead of sending the complex query to the database all the time, you can save the query as a view and then SELECT * FROM view
Q9.Can View could be indexed?
Yes. View can be indexed.The big disadvantage of indexed views is that they are recreated every time the underlying table data changes. That restricts the use of indexed views to data that does not change often, typically in a data warehouse or business intelligence environment.
Q10.Which one is Faster Optimistic or Pessimistic?
Optimistic locking assumes concurrent transactions can complete without affecting each other. So Optimistic locking is faster because no locks are enforced while doing transactions.Optimistic locking does not cause transactions to wait for each other.Optimistic locking possibly causes a transaction to fail, but it does so without any “lock” ever having been taken.
The word “optimistic” derives from exactly the “I will not be taking actual locks because I hope they won’t be needed anyway. If it turns out I was wrong about that, I will accept the inevitable failure.”