SQL (Structured Query Language)
Here's a breakdown:
Relational Databases: These databases organize information
in a structured way, using tables with rows and columns. Each row represents a
record, and each column represents a specific attribute of that record.
SQL's Role:
Data Definition Language (DDL): Creates and modifies the database structure.
Examples: CREATE TABLE, ALTER TABLE, DROP TABLE
Data Manipulation Language (DML): Manipulates data within the database.
Examples: SELECT, INSERT, UPDATE, DELETE
Data Control Language (DCL): Controls access to the database.
Examples: GRANT, REVOKE
Key SQL Commands:
SELECT: Retrieves data from one or more tables.
FROM: Specifies the table(s) to retrieve data from.
WHERE: Filters the data based on specific conditions.
ORDER BY: Sorts the results based on one or more columns.
GROUP BY: Groups rows that have the same values in specified
columns.
JOIN: Combines data from two or more tables based on a
related column.
INSERT: Adds new rows to a table.
UPDATE: Modifies existing data in a table.
DELETE: Removes rows from a table.
Example:
FROM Customers
WHERE Country = 'USA';
This query retrieves all information (*) from the
"Customers" table where the "Country" column equals
"USA".
Why is SQL Important?
Data Analysis and Decision Making: SQL enables powerful data
analysis and reporting, crucial for businesses to make informed decisions.
Widely Used: Found in various applications, from e-commerce
websites to financial systems and scientific research.
Here's why SQL is considered a language:
- Syntax and Grammar: SQL has its own specific syntax and grammar rules. It uses keywords, operators, and clauses to form statements that instruct the database on what to do.
- Data
Manipulation: SQL commands allow you to manipulate data within the
database, such as:
- Retrieving
data: Using SELECT to extract specific information from tables.
- Inserting
data: Adding new records to tables using INSERT.
- Updating
data: Modifying existing data using UPDATE.
- Deleting data: Removing records from tables using DELETE.
- Data
Definition: SQL commands also define the structure of the database
itself:
- Creating
tables: Using CREATE TABLE to define the structure of tables
(columns, data types).
- Modifying
table structures: Using ALTER TABLE to add, modify, or delete
columns.
- Creating
databases: Defining and managing databases.
- Data
Control: SQL provides commands for managing access to the database,
such as:
- Granting and revoking user permissions: Controlling who can access and modify data.
Key Characteristics of SQL as a Language:
- Declarative:
SQL focuses on describing what data you want to retrieve or
manipulate, rather than how to retrieve it. This makes it easier to
write and understand SQL queries.
- Domain-Specific:
SQL is specifically designed for interacting with relational databases.
It's not a general-purpose programming language like Python or Java.
- Standard-Based:
While there are variations between different database systems, SQL adheres
to a set of standards, ensuring a degree of portability across different
platforms.
In essence, SQL possesses the fundamental characteristics of a programming language: it has its own syntax, allows for the execution of instructions, and enables the creation of complex logic for interacting with a specific type of system (relational databases).
SQL Joins
In SQL, joins are used to combine data from two or more
tables based on a related column. This allows you to retrieve information that
is spread across multiple tables within your database. Here's a breakdown of
common join types:
1. INNER JOIN
- Purpose:
Returns rows where there is a match in both tables based on the join
condition.
- Syntax:
SQL
SELECT column1, column2, ...
FROM table1
INNER JOIN table2
ON table1.column = table2.column;
2. LEFT JOIN
- Purpose:
Returns all rows from the "left" table (the one specified before
LEFT JOIN) and the matching rows from the "right" table. If
there's no match in the right table, it returns NULL values for the
columns from the right table.
- Syntax:
SQL
SELECT column1, column2, ...
FROM table1
LEFT JOIN table2
ON table1.column = table2.column;
3. RIGHT JOIN
- Purpose:
Returns all rows from the "right" table and the matching rows
from the "left" table. If there's no match in the left table, it
returns NULL values for the columns from the left table.
- Syntax:
SQL
SELECT column1, column2, ...
FROM table1
RIGHT JOIN table2
ON table1.column = table2.column;
4. FULL OUTER JOIN
- Purpose:
Returns all rows from both tables, including rows with no match in the
other table. For unmatched rows, it returns NULL values for the columns
from the other table.
- Syntax:
SQL
SELECT column1, column2, ...
FROM table1
FULL OUTER JOIN table2
ON table1.column = table2.column;
Example:
Let's say you have two tables:
- Customers:
CustomerID, CustomerName
- Orders:
OrderID, CustomerID, OrderDate
To get a list of all customers and their orders (including
customers with no orders):
SQL
SELECT Customers.CustomerName, Orders.OrderID,
Orders.OrderDate
FROM Customers
LEFT JOIN Orders
ON Customers.CustomerID = Orders.CustomerID;
This query uses a LEFT JOIN to include all customers, even
if they haven't placed any orders.
Key Considerations:
- Choose
the appropriate join type based on the specific requirements of your
query.
- Use
clear and meaningful aliases for tables to improve readability.
- Understand
the relationships between the tables you are joining.
SQL Complex Joins
Complex joins involve combining data from multiple tables
using a series of JOIN operations. They are essential for retrieving intricate
relationships within your database. Here's a breakdown of some common scenarios
and how to approach them:
1. Multiple Joins:
- Scenario:
You need to join data across three or more tables.
- Approach:
Chain multiple JOIN clauses together.
SQL
SELECT
c.CustomerName,
o.OrderID,
p.ProductName
FROM
Customers c
JOIN
Orders o ON
c.CustomerID = o.CustomerID
JOIN
Products p ON
o.ProductID = p.ProductID;
This query joins the Customers, Orders, and Products tables
to retrieve customer names, order IDs, and product names for each order.
2. Self-Join:
- Scenario:
Joining a table with itself to find relationships within the same table.
- Approach:
Create an alias for the table and join it with itself using different
column names.
SQL
SELECT
e1.EmployeeID,
e1.FirstName,
e2.FirstName AS
ManagerName
FROM
Employees e1
JOIN
Employees e2 ON
e1.ManagerID = e2.EmployeeID;
This query finds the names of employees and their respective
managers.
3. Non-Equijoins:
- Scenario:
Joining tables based on conditions other than equality (e.g., greater
than, less than).
- Approach:
Use comparison operators in the ON clause.
SQL
SELECT
c.CustomerID,
c.City,
o.OrderID
FROM
Customers c
JOIN
Orders o ON
c.CustomerID = o.CustomerID AND o.OrderDate > '2023-01-01';
This query retrieves customer IDs and cities for customers
who placed orders after January 1, 2023.
4. Cross Join:
- Scenario:
Combining every row from one table with every row from another table
(Cartesian product).
- Approach:
Use the CROSS JOIN keyword or omit the ON clause in a regular join.
SQL
SELECT
c.CustomerID,
p.ProductID
FROM
Customers c
CROSS JOIN
Products p;
This query would generate a result set with every customer
paired with every product.
Important Considerations:
- Performance:
Complex joins can impact query performance. Use indexes on the join
columns to improve efficiency.
- Readability:
Use meaningful aliases for tables and columns to make your queries easier
to understand.
- Testing:
Thoroughly test your queries to ensure they return the expected results.
By mastering these complex join techniques, you can
effectively retrieve and analyze intricate relationships within your relational
databases.
Indexes in SQL play a crucial role in optimizing the
performance of database queries, particularly for large datasets. Here's why
they are important:
- Faster
Data Retrieval:
- Reduced
Search Time: Indexes act like an index in a book, allowing the
database system to quickly locate specific rows without having to scan
the entire table. This significantly speeds up queries that involve
searching for data based on specific column values.
- Improved
Query Performance: Indexes are particularly beneficial for queries
that use WHERE clauses, ORDER BY clauses, and JOIN operations.
- Enhanced
Database Performance:
- Faster
Queries: Faster query execution leads to improved overall database
performance, resulting in quicker response times for applications that
rely on the database.
- Improved
Scalability: Indexes can help databases handle larger volumes of data
more efficiently, as queries can be executed more quickly even with a
growing dataset.
- Improved
Data Integrity:
- Unique
Indexes: Unique indexes ensure that no duplicate values exist in a
specific column, helping to maintain data integrity and consistency.
Analogy:
Imagine a library. Without an index, you would have to
search through every book to find the information you need. With an index, you
can quickly locate the relevant pages by looking up the topic in the index. Indexes
in SQL work similarly, providing a faster way to locate and retrieve specific
data within a table.
Key Considerations:
- Index
Creation: Creating indexes has a cost. It takes time and resources to
create and maintain indexes.
- Index
Maintenance: Updates, inserts, and deletes on indexed columns can
impact index performance. The database system needs to update the index to
reflect the changes, which can add overhead.
- Over-indexing:
Creating too many indexes can actually slow down database performance,
especially during data modification operations.
In summary:
Indexes are a vital component of efficient database
management. By carefully selecting the columns to index, you can significantly
improve query performance and overall database efficiency.
Subqueries
A subquery is a query within another SQL query. A subquery
is also called inner query or inner select and the statement containing a
subquery is called an outer query or outer select.
Note:
1. Subqueries must be enclosed within parenthesis,
2. An ORDER BY cannot be used in a subquery.
3. The image type such as BLOB, array, text datatypes are
not allowed in subqueries.
Subqueries can be used with select, insert, update and
delete statement within where, from, select clause along with IN, comparison
operators, etc.
We have a table named ITCompanyInNepal on which we will
perform queries to show subqueries examples:
Examples: SubQueries With Select Statement
with In operator and where clause:
SELECT * FROM ITCompanyInNepal
WHERE Headquarter IN (SELECT Headquarter FROM
ITCompanyInNepal WHERE Headquarter = 'USA');
with comparison operator and where clause
SELECT * FROM ITCompanyInNepal WHERE NumberOfEmployee <
(SELECT AVG(NumberOfEmployee) FROM ITCompanyInNepal )
with select clause
SELECT CompanyName, CompanyAddress, Headquarter, (Select
SUM(NumberOfEmployee) FROM ITCompanyInNepal Where Headquarter = 'USA') AS
TotalEmployeeHiredByUSAInKathmandu FROM ITCompanyInNepal WHERE CompanyAddress =
'Kathmandu' AND Headquarter = 'USA'
Subqueries with insert statement
We have to insert data from IndianCompany table to
ITCompanyInNepal. The table for IndianCompany is shown below:
INSERT INTO ITCompanyInNepal SELECT * FROM IndianCompany
Subqueries with update statement
Suppose all the companies whose headquarter is USA decided
to fire 50 employees from all US based companies of Nepal due to some change in
policy of USA companies.
UPDATE ITCompanyInNepal SET NumberOfEmployee =
NumberOfEmployee - 50 WHERE Headquarter IN (SELECT Headquarter FROM
ITCompanyInNepal WHERE Headquarter = 'USA')
Subqueries with Delete Statement
Suppose all the companies whose headquarter is Denmark
decided to shutdown their companies from Nepal.
DELETE FROM ITCompanyInNepal WHERE Headquarter IN (SELECT
Headquarter FROM ITCompanyInNepal WHERE Headquarter = 'Denmark'
Union and Union all
Union operation combines the results of two or more queries
into a single result set that includes all the rows that belong to all queries
in the union and will ignore any duplicates that exist. Union all also does the
same thing but include even the duplicate values. The concept of union
operation will be clear from the example below. Few things to consider while
using union are:
1.The number and the order of the columns must be the same
in all queries.
2.The data types must be compatible.
Example:
We have three tables : Marksheet1, Marksheet2 and
Marksheet3. Marksheet3 is the duplicate table of Marksheet2 which contains same
values as that of Marksheet2.
Table1: Marksheet1
Table2: Marksheet2
Table3: Marksheet3
Union on tables Marksheet1 and Marksheet2
SELECT SubjectCode, SubjectName, MarksObtained FROM
Marksheet1
UNION
SELECT CourseCode, CourseName, MarksObtained FROM Marksheet2
Note: The output for union of the three tables will
also be same as union on Marksheet1 and Marksheet2 because union operation does
not take duplicate values.
SELECT SubjectCode, SubjectName, MarksObtained FROM
Marksheet1
UNION
SELECT CourseCode, CourseName, MarksObtained FROM Marksheet2
UNION
SELECT SubjectCode, SubjectName, MarksObtained FROM
Marksheet3
OUTPUT
Union All
SELECT SubjectCode, SubjectName, MarksObtained FROM
Marksheet1
UNION ALL
SELECT CourseCode, CourseName, MarksObtained FROM Marksheet2
UNION ALL
SELECT SubjectCode, SubjectName, MarksObtained FROM
Marksheet3
You will notice here that the duplicate values from Marksheet3 are also displayed using union all.
SQL Aggregate Functions
Aggregate functions in SQL Server run calculations on sets
of values, returning a single value.
SUM()
Returns sum of numeric values in a given column.
We have table as shown in figure that will be used to
perform different aggregate functions. The table name is Marksheet.
Select SUM(MarksObtained) From Marksheet
The sum function doesn't consider rows with NULL value in
the field used as parameter In the above example
if we have another row like this:
106 Italian
NULL
This row will not be considered in sum calculation
AVG()
Returns average of numeric values in a given column.
We have table as shown in figure that will be used to
perform different aggregate functions. The table name is Marksheet.
Select AVG(MarksObtained) From Marksheet
The average function doesn't consider rows with NULL value
in the field used as parameter.
In the above example
if we have another row like this:
106 Italian NULL
This row will not be considered in average calculation
MAX()
Returns the largest value in a given column.
We have table as shown in figure that will be used to
perform different aggregate functions. The table name is Marksheet.
Select MAX(MarksObtained) From Marksheet
MIN()
Returns the smallest value in a given column.
We have table as shown in figure that will be used to
perform different aggregate functions. The table name is Marksheet.
Select MIN(MarksObtained) From Marksheet
COUNT()
Returns the total number of values in a given column. We
have table as shown in figure that will be used to perform different aggregate
functions.
The table name is Marksheet.
Select COUNT(MarksObtained) From Marksheet
The count function doesn't consider rows with NULL value in
the field used as parameter. Usually the count parameter is * (all fields) so
only if all fields of row are NULLs this row will not be considered.
In the above example
if we have another row like this:
106 Italian
NULL
This row will not be considered in count calculation
NOTE The function COUNT(*) returns the number of rows in a
table. This value can also be obtained by using a constant non-null expression
that contains no column references, such as COUNT(1).
Example
Select COUNT(1) From Marksheet
COUNT(Column_Name) with GROUP BY Column_Name
Most of the time we like to get the total number of
occurrence of a column value in a table for example:
TABLE NAME : REPORTS
ReportName ReportPrice
Test 10.00
$
Test 10.00
$
Test 10.00
$
Test 2 11.00
$
Test 10.00
$
Test 3 14.00
$
Test 3 14.00
$
Test 4 100.00
$
SELECT ReportName AS REPORT NAME, COUNT(ReportName) AS COUNT
FROM REPORTS GROUP BY ReportName
REPORT NAME COUNT
Test 4
Test2 1
Test3 2
Test4 1
Ranking Functions
Arguments |
Details |
partition_by_clause |
Divides the
result set produced by the FROM clause into partitions to which the
DENSE_RANK function is applied. For the PARTITION BY syntax, see OVER Clause
(Transact-SQL). |
order_by_clause |
Determines
the order in which the DENSE_RANK function is applied to the rows in a
partition |
OVER ( [
partition_by_clause ] order_by_clause) |
partition_by_clause
divides the result set produced by the FROM clause into partitions to which
the function is applied. If not specified, the function treats all rows of
the query result set as a single group. order_by_clause determines the order
of the data before the function is applied. The order_by_clause is required.
The of the OVER clause cannot be specified for the RANK function. For more
information, see OVER Clause (Transact-SQL). |
DENSE_RANK ()
Same as that of RANK(). It returns rank without any gaps:
Select Studentid, Name,Subject,Marks, DENSE_RANK()
over(partition by name order by Marks desc)Rank From Exam order by name
Studentid Name
Subject
Marks Rank
101 Ivan
Science 80 1
101 Ivan Maths 70
2
101 Ivan
Social 60 3
102 Ryan
Social 70 1
102 Ryan
Maths 60 2
102 Ryan
Science 50 3
103 T Tanvi
Maths 90 1
103 Tanvi
Science 90 1
103 Tanvi
Social 80 2
RANK()
A RANK() Returns the rank of each row in the result set of
partitioned column.
Eg :
Select Studentid,Name,Subject,Marks, RANK() over(partition
by name order by Marks desc)Rank From Exam order by name,subject
Studentid Name
Subject Marks Rank
101
Ivan Maths 70 2
101 Ivan
Science 80 1
101 Ivan
Social 60 3
102 Ryan
Maths 60 2
102 Ryan
Science 50 3
102 Ryan
Social 70 1
103 Tanvi
Maths 90 1
103 Tanvi
Science 90 1
103 Tanvi
Social 80 3
Labels: Avg(), Cross Join, Data Analysis using SQL, DCL, DDL, DML, Equi Join, Max(), Min(), Rank(). Dense_Rank(), Self Join, SQL (Structured Query Language), SQL Joins, Subqueries, Sum(), Union and Union All
0 Comments:
Post a Comment
Subscribe to Post Comments [Atom]
<< Home