Use LEFT and RIGHT arrow keys to navigate between flashcards;
Use UP and DOWN arrow keys to flip the card;
H to show hint;
A reads text to speech;
136 Cards in this Set
- Front
- Back
What is the difference between the functions without the TRY and their counterparts with the TRY
SELECT CAST('abc' AS INT); SELECT TRY_CAST('abc' AS INT); |
that those without the TRY fail if the value isn’t convertible, whereas those with the TRY return a NULL in such a case.
SELECT CAST('abc' AS INT); << FAILS SELECT TRY_CAST('abc' AS INT); << RETURNS NULL |
|
As for the difference between CAST, CONVERT, and PARSE,
|
with CAST, you indicate the expression and the target type
with CONVERT there’s a third argument representing the style for the conversion, which is supported for some conversions, like between character strings and date and time values. For example, CONVERT(DATE, '1/2/2012', 101) converts the literal character string to DATE using style 101 representing the United States standard PARSE, you can indicate the culture by using any culture supported by the Microsoft .NET Framework. For example, PARSE('1/2/2012' AS DATE USING 'en-US') parses the input literal as a DATE by using a United States English culture. |
|
Float short comings (code)
|
DECLARE @F AS FLOAT = '29545428.022495';
SELECT CAST (@F AS NUMERIC (28, 14)) AS value; ---------------------------------------------------------------- 29545428.02249500200000 |
|
What is the difference between GETDATE and CURRENT_TIMESTAMP
|
GETDATE is T_SQL specific
CURRENT_TIMESTAMP is standard they both return the current date and time in SQL Server instance |
|
Note that there are no build-in functions to return the current date or the current time, to get such info ....
|
simply cast the SYSDATETIME function to DATE or TIME, respectively
CVAST (SYSDATETIME() AS DATE) |
|
DATEPART func.
|
Using the DATEPART func, you can extract from an input date and time value a desired part, such as year, min, etc.
DATEPART(month,'20120212') returns 2 |
|
DATEPART func and YEAR, MONTH, DATE
|
T-SQL provides the functions YEAR, MONTH, DATE as abbr to DATEPART
SELECT DATEPART(YEAR, '20130317') - return 2013 SELECT DATEPART(MONTH, '20130317') - return 03 SELECT DATEPART(DAY, '20130317') - return 17 |
|
DATENAM func
|
similar to DATEPART but returns character stings opposed to int value
SELECT DATENAME(MONTH, '20130317') - returns March |
|
T-SQL provide a set of functions that construct a desired date and time value from its numeric parts (list)
|
DATEFROMPARTS
DATETIME2FROMPARTS DATETIMEFROMPARTS DATETIMEOFFSETFROMPARTS SMALLDATETIMEFROMPARTS TIMEFROMPARTS |
|
Functions that compute the respective end of moth date for the input date and time
|
EOMONTH
EOMONTH(SYSDATETIME()) - in Feb will return 2012-02-29 |
|
DATEADD def
DATEDIFF def |
DATEADD - add a requested number of unites of a specified part to a specified date and time value
DATEADD(year, 1, '20120212) adds 1 year to 2012-02-12 DATEDIFF - computes the difference in days between two dates DATEDIFF(year, '20111131', '20120101') returns 1 NOTE: as we specified year in doesnt take in account days and month |
|
COALESCE function accepts a list of expressions as input and returns the first that is not NULL, or NULL if all are NULLs. For example, the expression COALESCE(NULL, 'x', 'y') returns 'x'.
|
COALESCE(<exp1>, <exp2>, …, <expn>)
is similar to the following. CASE WHEN <exp1> IS NOT NULL THEN <exp1> WHEN <exp2> IS NOT NULL THEN <exp2> … WHEN <expn> IS NOT NULL THEN <expn> ELSE NULL END |
|
NULL + ' word' returns NULL
If you want to substitute NULL with an empty string you can use option 2 |
CONCAT unlike the + operator substitutes a NULL input with an empty string
DECLARE @ST AS VARCHAR(10) = NULL SELECT CONCAT( @ST , N'Work') as work -- returns work |
|
DATALENGTH function (vs LEN)
|
returns the length of the input in terms of number of bytes
LEN function returns the length of an input string in terms of the number of characters DATALENGTH(N'xyz') returns 6 LEN(N'xyz') returns 3 |
|
REPLICATE function
|
allows you to replicate an input string a requested number of
times. For example, the expression REPLICATE('0', 10) replicates the string '0' ten times, returning '0000000000'. |
|
STUFF function
|
STUFF function operates on an input string provided as the first argument; then, from the character position indicated as the second argument, deletes the number of characters
indicated by the third argument. Then it inserts in that position the string specified as the fourth argument. For example, the expression STUFF(',x,y,z', 1, 1, '') removes the first character from the input string, returning 'x,y,z'. |
|
there’s no TRIM function that removes both leading and trailing spaces; to achieve this, you need to
|
to trim you need to nest one function call within another, as in RTRIM(LTRIM(<input>)).
|
|
FORMAT function
|
you can format an input value based on a format string, and optionally specify the culture as a third input where relevan
FORMAT(1759, '000000000') formats the input number as a character string with a fixed size of 10 characters with leading zeros, returning '0000001759'. |
|
COALESCE function
|
COALESCE function accepts a list of expressions as input and returns the first that is not NULL, or NULL if all are NULLs. For example, the expression COALESCE(NULL, 'x', 'y') returns 'x'
ISNULL that is similar to the standard |
|
standard NULLIF function
|
function accepts two input expressions, returns NULL if they are equal, and returns the first input if they are not.
For example, consider the expression NULLIF(col1, col2). If col1 is equal to col2, the function returns a NULL; other wise, it returns the col1 value. |
|
What is the difference between NEWID and NEWSEQUENTIALID?
|
The NEWID function generates GUID values in random order, whereas the NEWSEQUENTIAL ID function generates GUIDs that increase in a sequential order.
|
|
Which function returns the current date and time value as a DATETIME2 type?
|
The SYSDATETIME function.
|
|
What is the difference between the simple CASE expression and the searched CASE expression?
a. The simple CASE expression is used when the database recovery model is simple, and the searched CASE expression is used when it’s full or bulk logged. B. The simple CASE expression compares an input expression to multiple possible expressions in the WHEN clauses, and the searched CASE expression uses independent predicates in the WHEN clauses. c. The simple CASE expression can be used anywhere in a query, and the searched CASE expression can be used only in the WHERE clause. D. The simple CASE expression can be used anywhere in a query, and the searched CASE expression can be used only in quer y filters (ON, WHERE, HAVING). |
B correct: The difference between the two is that the simple form compares expressions and the searched form uses predicates
|
|
three-valued logic (def)
|
when NULLs are possible in the data, a predicate can evaluate to true, false, and unknown. This type of logic is known as three-valued logic
|
|
empid firstname lastname country region city
------ ---------- ------------- -------- ------- --------- 1 Sara Davis USA WA Seattle 2 Don Funk USA WA Tacoma 3 Judy Lew USA WA Kirkland 4 Yael Peled USA WA Redmond 5 Sven Buck UK NULL London 6 Paul Suurs UK NULL London 7 Russell King UK NULL London Correct query to return all employees not i WA state |
SELECT empid, firstname, lastname, country, region, city
FROM HR.Employees WHERE region <> N'WA' OR region IS NULL; If you use SELECT empid, firstname, lastname, country, region, city FROM HR.Employees WHERE region <> N'WA'; you will get an empty set since NULL <> WA returns unknown result |
|
Predicates need to be in a form known as ________
|
search argument (SARG) to allow efficient use of indexes
A predicate in the form column operator value or value operator column can be a search argument. For example, predicates like col1 = 10, and col1 > 10 are search arguments Applying manipulation to the filtered column in most cases prevents the predicate from being a search argument. An example for manipulation of the filtered column is applying a function to it, as in F(col1) = 10, where F is some function |
|
you have a stored procedure that accepts an input parameter @dt representing an input shipped date. The procedure is supposed to return orders that were shipped on the input date. If the shippeddate column did not allow NULLs how will the query look
|
SELECT orderid, orderdate, empid
FROM Sales.Orders WHERE shippeddate = @dt; |
|
you have a stored procedure that accepts an input parameter @dt representing an input shipped date. The procedure is supposed to return orders that were shipped on the input date. If the shippeddate column did allow NULLs how will the query look
|
SELECT orderid, orderdate, empid
FROM Sales.Orders WHERE shippeddate = @dt OR (shippeddate IS NULL AND @dt IS NULL); ---- this is a second solution that is not optimal due to manipulation of the shippeddate column causing SQL not to use SARG |
|
Some precedence rules determine the logical evaluation order of the different predicates. Which operator precedes which (NOT, AND and OR)
|
The order is
NOT before AND/OR AND before OR () have the highest precedence |
|
Suppose that you need to query the Sales.Orders table and return only orders placed on February 12, 2007. You use the following query.
SELECT orderid, orderdate, empid, custid FROM Sales.Orders WHERE orderdate = '02/12/07'; Using '02/12/2007' will interpret the date differently based on the logon default language (for British this is dec 02 2007) How to fix this? |
1) use language-neutral literal like '20070212' which is always ymd, regardless of your languge
SELECT orderid, orderdate, empid, custid FROM Sales.Orders WHERE orderdate = '20070212'; 2) use CONVERT or PARSE func where you can indicate how you want SQL Server to interpret the literal CONVERT func support style number PARSE func support indicating a culture name |
|
suppose that you need to filter only orders placed in February 2007. You can use the YEAR and MONTH functions, as in the following.
SELECT orderid, orderdate, empid, custid FROM Sales.Orders WHERE YEAR(orderdate) = 2007 AND MONTH(orderdate) = 2; because here you apply manipulation to the filtered column, the predicate is not considered a search argument, and therefore, SQL Server won’t be able to rely on index ordering. You could revise your predicate as a range, like the following |
SELECT orderid, orderdate, empid, custid
FROM Sales.Orders WHERE orderdate >= '20070201' AND orderdate < '20070301'; |
|
You can indicate the ordering direction on an expression-by-expression basis like
|
ORDER BY col1 DESC, col2, col3 DESC (col1 descending, then col2 ascending, then col3 descending)
|
|
You can specify a percent of rows to filter (instead of number) syntax
|
SELECT TOP (1) PERCENT orderid, orderdate, custid, empid
FROM Sales.Orders ORDER BY orderdate DESC; |
|
backwards compatible syntax for SELECT TOP
|
SELECT TOP (3) orderid, orderdate, custid, empid
FROM Sales.Orders ORDER BY orderdate DESC; T-SQL supports specifying the number of rows to filter using the TOP option in SELECT queries without parentheses, but that’s only for backward compatibility reasons. The correct syntax is with parentheses. |
|
You don’t always care about guaranteeing deterministic or repeatable results for SELECT TOP; but if you do, two options are available to you. One option is to ask to include all ties with the last row by adding the WITH TIES option, as follows
|
SELECT TOP (3) WITH TIES orderid, orderdate, custid, empid
FROM Sales.Orders ORDER BY orderdate DESC; This can of course, this could result in returning more rows than you asked for |
|
OFFSET clause
FETCH clause |
indicating how many rows you want to skip (0 if you don’t want to skip any);
indicating how many rows you want to filter SELECT orderid, orderdate, custid, empid FROM Sales.Orders ORDER BY orderdate DESC, orderid DESC OFFSET 50 ROWS FETCH NEXT 25 ROWS ONLY; |
|
keywords NEXT or FIRST for
FETCH NEXT and FETCH FIRST |
When skipping some rows, it might be more intuitive to you to use the keywords FETCH NEXT to indicate how many rows to filter; but when not skipping any rows, it might be more intuitive to you to use the keywords
FETCH FIRST, as follows. SELECT orderid, orderdate, custid, empid FROM Sales.Orders ORDER BY orderdate DESC, orderid DESC OFFSET 0 ROWS FETCH FIRST 25 ROWS ONLY; |
|
OFFSET clause doesnt require FETCH clause
|
by indicating OFFSET clause you are requesting to skip some rows but then by not indicating a FETCH clause your are requesting to return all remaining rows
For example, the following query requests to skip 50 rows SELECT orderid, orderdate, custid, empid FROM Sales.Orders ORDER BY orderdate DESC, orderid DESC OFFSET 50 ROWS; |
|
What is SQL standard OFFSET-FETCH or TOP
What are the benefits of using OFFSET-FETCH over TOP? |
OFFSET-FETCH
OFFSET-FETCH is standard and TOP isn’t; also, OFFSET-FETCH supports a skipping capability that TOP doesn’t. |
|
How do you guarantee deterministic results with TOP?
|
By either returning all ties by using the WITH TIES option or by defining unique ordering to break ties.
|
|
You are tasked with writing a query against the Production.Products table, returning the five most expensive products from category 1. Write the following query.
SELECT TOP (5) productid, unitprice FROM Production.Products WHERE categoryid = 1 ORDER BY unitprice DESC; productid unitprice ---- --------- 38 263.50 43 46.00 2 19.00 1 18.00 35 18.00 |
This query returns the desired result, except it doesn’t have any handling of ties. In other words, the ordering among products with the same unit price is nondeterministic. Below is the deterministic query
SELECT TOP (5) WITH TIES productid, unitprice FROM Production.Products WHERE categoryid = 1 ORDER BY unitprice DESC |
|
You execute a query with a TOP (3) option. Which of the following options most accurately describes how many rows will be returned?
A. Fewer than three rows B. Three rows or fewer C. Three rows D. Three rows or more E. More than three rows F. Fewer than three, three, or more than three rows |
B. Correct: If there are fewer rows than three in the query result without TOP, the query will return only those rows. If there are three rows or more without TOP, the query will return three rows.
|
|
You execute a query with TOP (3) WITH TIES and nonunique ordering. Which of the following options most accurately describes how many rows will be returned?
A. Fewer than three rows B. Three rows or fewer C. Three rows D. Three rows or more E. More than three rows F. Fewer than three, three, or more than three rows |
F. Correct: If there are fewer rows than three in the query result without TOP, the query will return only those rows. If there are at least three rows in the result and no ties with the third, the query will return three rows. If there are more than three rows in the result, as well as ties with the third row, the query will return more than three rows.
|
|
Which of the following OFFSET-FETCH options are valid in T-SQL? (Choose all that apply.)
A. SELECT … ORDER BY orderid OFFSET 25 ROWS B. SELECT … ORDER BY orderid FETCH NEXT 25 ROWS ONLY C. SELECT … ORDER BY orderid OFFSET 25 ROWS FETCH NEXT 25 ROWS ONLY D. SELECT … <no ORDER BY> OFFSET 0 ROWS FETCH FIRST 25 ROWS ONLY |
A. Correct: T-SQL supports indicating an OFFSET clause without a FETCH clause.
C. Correct: T-SQL supports indicating both OFFSET and FETCH clauses. |
|
Cross Joins def
|
Cross Joins aka Cartesian product
|
|
Inner Joins def
|
inner join, you can match rows from two tables based on a predicate—usually one that compares a primary key value in one side to a foreign key value in another side.
|
|
outer joins def
|
you can request to preserve all rows from one or both sides of the join, never mind if there are matching rows in the other side based on the ON predicate.
|
|
Write a query that matches customers with their respective orders, returning only matches. You are not required to return customers with no related orders
C.custid, C.companyname, O.orderid, O.orderdate from Sales.Customer and Sales.Order |
SELECT C.custid, C.companyname, O.orderid, O.orderdate
FROM Sales.Customers AS C INNER JOIN Sales.Orders AS O ON C.custid = O.custid; |
|
Return only customers without orders
C.custid, C.companyname, O.orderid, O.orderdate from Sales.Customer and Sales.Order |
SELECT C.custid, C.companyname, O.orderid, O.orderdate
FROM Sales.Customers AS C LEFT OUTER JOIN Sales.Orders AS O ON C.custid = O.custid WHERE O.orderid IS NULL; |
|
What are the definitions for a subquery in terms of result of the subquery
|
scalar
multi-valued table-valued |
|
What subqueries can be in relation to the outer query
|
self-contained - independent of the outer query
correlated - having a reference to a columnt from the table in the outer query |
|
Self-contained subqueries (def)
|
are subqueries that have no dependency on the outer query
you can highlight the inner query and run it independently |
|
Correlated subqueries (def)
|
are subqueries where the inner query has a reference to a column from the table in the ourter query
|
|
EXISTS predicate
|
EXISTS predicate accepts a subquery as input and returns true when the subquery returns at least one row and false otherwise
the following query returns customers who placed orders on February 12, 2007. SELECT custid, companyname FROM Sales.Customers AS C WHERE EXISTS (SELECT * FROM Sales.Orders AS O WHERE O.custid = C.custid AND O.orderdate = '20070212'); |
|
Table expression (def)
|
Table expressions are named queries. You write an inner query that returns a relational result set, name it, and query it from an outer query. T-SQL supports four forms of table expressions:
■■ Derived tables ■■ Common table expressions (CTEs) ■■ Views (definition is preserved) ■■ Inline table-valued functions (definition is preserved) |
|
common table expression (CTE)
|
a named table expression that is visible only to the statement that defines it a query against a CTE involves three main parts
■■The inner query ■■ The name you assign to the query and its columns ■■ The outer query |
|
CTEs sample syntax
|
WITH <CTE_name>
AS ( <inner_query> ) <outer_query>; |
|
CTEs real example
|
WITH C AS ( SELECT
ROW_NUMBER() OVER(PARTITION BY categoryid ORDER BY unitprice, productid) AS rownum, categoryid, productid, productname, unitprice FROM Production.Products ) SELECT categoryid, productid, productname, unitprice FROM C WHERE rownum <= 2; categoryid|productid|productname|unitprice 1 |24 |Product QOGNU|4.50 1 |75 |Product BWRLG |7.75 2 |3 |Product IMEHJ |10.00 2 |77 |Product LUNZZ |13.00 .... |
|
derived tables and CTE vs views and table-valued function
|
derived tables and CTEs are table expressions that are visible only in the scope of the statement that defines them while
views and table-valued functions are stored as object in the db (being in an object in the database allows you to define permissions) |
|
difference between views and inline table-valued functions is that
|
views dont accept input parameters
table-valued functions do accept a parameter |
|
APPLY operator
|
use to apply a table expression given to its as the right input to each for from a table expressing given to it as the left input
if compared to JOIN - APPLY allow the right table expression to be correlated to the left table So conceptually, the right table expression is evaluated separately for each left row. |
|
CROSS APPLY operator
|
operates on left and right table expressions as inputs. The right table expression can have a correlation to elements from the left table. The right table expression is applied to each row from the left input. What’s special about the CROSS APPLY operator as compared to OUTER APPLY is that if the right table expression returns an empty set for a left row, the left row isn’t returned
|
|
OUTER APPLY operator
|
oes what the CROSS APPLY operator does, but also includes in the result rows from the left side that get an empty set back from the right side
|
|
What is the difference between the APPLY and JOIN operators?
|
With a JOIN operator, both inputs represent static relations. With APPLY, the left side is a static relation, but the right side can be a table expression with correlations to elements from the left table.
|
|
T-SQL supports three set operators (list)
|
UNION,
INTERSECT, and EXCEPT; it also supports one multiset operator: UNION ALL |
|
What is the difference between UNION and UNION ALL
|
if you want to keep duplicates you use UNION ALL
otherwise use UNION |
|
Set operators precedence:
|
INTERSECT
UNION, EXCEPT (considered equal) |
|
HAVING vs WHERE (def)
|
unlike the WHERE clause, which evaluated at the row level
HAVING clauses uses a predicate per group as opposed to per row |
|
What is the difference between the COUNT(*) aggregate function and the COUNT(<expression>) general set function?
A. COUNT(*) counts rows; COUNT(<expression>) counts rows where <expression> is not NULL. B. COUNT(*) counts columns; COUNT(<expression>) counts rows. C. COUNT(*) returns a BIGINT; COUNT(<expression>) returns an INT. D. There’s no difference between the functions. |
a. The COUNT(*) function doesn’t operate on an input expression; instead, it counts the number of rows in the group
The COUNT(<expression>) function operates on an expression and ignores NULLs |
|
from a logical query processing perspective what is the order of clause execution
|
FROM
WHERE GROUP BY HAVING SELECT ORDER BY |
|
Following query
SELECT S.shipperid, S.companyname, COUNT(*) AS numorders FROM Sales.Shippers AS S JOIN Sales.Orders AS O ON S.shipperid = O.shipperid GROUP BY S.shipperid; ------------ returns ----------------- Msg 8120, Level 16, State 1, Line 1 Column 'Sales.Shippers.companyname' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. ---------------- WHY ---------------- |
S.companyname column neither appear in the GROUP BY list nor is it contained in an aggregate function, it's not allowed in the HAVING, SELECT and ORDER BY clauses
----------- WORKAROUND ---------------- SELECT S.shipperid, S.companyname, COUNT(*) AS numorders FROM Sales.Shippers AS S INNER JOIN Sales.Orders AS O ON S.shipperid = O.shipperid GROUP BY S.shipperid, S.companyname; |
|
T-SQL supports three clauses that allow defined multiple define multiple grouping sets in the same query (list):
|
GROUPING SETS
CUBE ROLLUP |
|
You can use the GROUPING SETS clause to list all grouping sets that you want to define in the query. As an example, the following query defines four grouping sets.
|
SELECT shipperid, YEAR(shippeddate) AS shipyear, COUNT(*) AS numorders
FROM Sales.Orders GROUP BY GROUPING SETS ( ( shipperid, YEAR(shippeddate) ), ( shipperid ), ( YEAR(shippeddate) ), ( ) ); |
|
CUBE clause
|
accepts a list of expressions as inputs and defines all possible grouping sets that can be generated from the inputs—including the empty grouping set
SELECT shipperid, YEAR(shippeddate) AS shipyear, COUNT(*) AS numorders FROM Sales.Orders GROUP BY CUBE( shipperid, YEAR(shippeddate) ); The CUBE clause defines all four possible grouping sets from the two inputs: 1. ( shipperid, YEAR(shippeddate) ) 2. ( shipperid ) 3. ( YEAR(shippeddate) ) 4. ( ) |
|
What makes a query a grouped query?
What are the clauses that you can use to define multiple grouping sets in the same query? |
When you use an aggregate function, a GROUP BY clause, or both.
GROUPING SETS, CUBE, and ROLLUP. |
|
What is the restriction that grouped queries impose on your expressions?
A. If the query is a grouped query, you must invoke an aggregate function. B. If the query has an aggregate function, it must have a GROUP BY clause. C. The elements in the GROUP BY clause must also be specified in the SELECT clause. D. If you refer to an element from the queried tables in the HAVING, SELECT, or ORDER BY clauses, it must either appear in the GROUP BY list or be contained by an aggregate function. |
D Correct: A grouped query returns only one row per group. For this reason, all expressions that appear in phases that are evaluated after the GROUP BY clause
(HAVING, SELECT, and ORDER BY) must guarantee returning a single value per group. That’s where the restriction comes from. |
|
What is the purpose of the GROUPING and GROUPING_ID functions? (Choose all that apply.)
A. You can use these functions in the GROUP BY clause to group data. B. You can use these functions to tell whether a NULL in the result represents a placeholder for an element that is not part of the grouping set or an original NULL from the table. C. You can use these functions to uniquely identify the grouping set that the result row is associated with. D. These functions can be used to sort data based on grouping set association—that is, first detail, and then aggregates. |
Correct Answers: B, C, and D
|
|
What is the difference between the COUNT(*) aggregate function and the
COUNT(<expression>) general set function? A. COUNT(*) counts rows; COUNT(<expression>) counts rows where <expression> is not NULL. B. COUNT(*) counts columns; COUNT(<expression>) counts rows. C. COUNT(*) returns a BIGINT; COUNT(<expression>) returns an INT. D. There’s no difference between the functions. |
Correct: The COUNT(*) function doesn’t operate on an input expression; instead, it counts the number of rows in the group. The COUNT(<expression>) function operates on an expression and ignores NULLs. Interestingly, COUNT(<expression>) returns 0 when all inputs are NULLs, whereas other general set functions like MIN, MAX, SUM, and AVG return a NULL in such a case.
|
|
Pivoting def
Unpivoting def |
Pivoting is a specialized case of grouping and aggregating of data.
When you pivot data, you need to identify three things: the grouping element, spreading element, and aggregation element Unpivoting is, in a sense, the inverse of pivoting. |
|
all pivot queries, you need to identify three elements
|
¡¡ What do you want to see on rows? This element is known as the on rows, or grouping element.
¡¡ What do you want to see on columns? This element is known as the on cols, or spreading element. ¡¡ What do you want to see in the intersection of each distinct row and column value? This element is known as the data, or aggregation element. |
|
PIVOT general syntax
|
WITH PivotData AS
( SELECT < grouping column >, < spreading column >, < aggregation column > FROM < source table > ) SELECT < select list > FROM PivotData PIVOT( < aggregate function >(< aggregation column >) FOR < spreading column > IN (< distinct spreading values >) ) AS P; |
|
What function isn't allowed as the aggregate function used by the PIVOT operator?
PIVOT operator limitation for aggregate functions usage? |
COUNT(*)
you can use COUNT(<col name>) PIVOT operator is limited to using only one aggregate functions |
|
UNPIVOT general syntax
|
SELECT < column list >, < names column >, < values column >
FROM < source table > UNPIVOT( < values column > FOR < names column > IN( <source columns> ) ) AS U; |
|
What UNPIVOT does with rows that contain NULL values?
What type of language constructs are PIVOT and UNPIVOT implemented as? |
Teh assumption is htat hose represent inapplicable cases
PIVOT and UNPIVOT are implemented as table operators. |
|
Which of the following are not allowed in the PIVOT operator’s specification? (Choose all that apply.)
A. Specifying a computation as input to the aggregate function B. Specifying a computation as the spreading element C. Specifying a subquery in the IN clause D. Specifying multiple aggregate functions |
Correct answer all
|
|
Window aggregate functions list
|
are the same as the group aggregate functions (for example, SUM, COUNT, AVG, MIN, and MAX), except window aggregate functions are applied to a window of rows defined by the OVER clause
expression SUM(val) OVER() represents the grand total of all rows in the underlying query SUM(val) OVER(PARTITION BY custid) represents the current customer’s total. |
|
SELECT custid, orderid,
val, SUM(val) OVER(PARTITION BY custid) AS custtotal, SUM(val) OVER() AS grandtotal FROM Sales.OrderValues; query against the Sales.OrderValues view returning for each order the customer ID, order ID, and order value; using window functions, the query also returns the grand total of all values and the customer total |
custid orderid val custtotal grandtotal
------- -------- ------- ---------- ----------- 1 ||10643| 814.50| 4273.00| 1265793.22 1 ||10692| 878.00| 4273.00| 1265793.22 1 ||10702| 330.00| 4273.00| 1265793.22 1 ||10835| 845.80| 4273.00| 1265793.22 1 ||10952| 471.20| 4273.00| 1265793.22 1 ||11011| 933.50| 4273.00| 1265793.22 2 ||10926| 514.40| 1402.95| 1265793.22 2 ||10759| 320.00| 1402.95| 1265793.22 |
|
Window aggregate functions support another filtering option called framing (def)
|
define ordering within the partition by using a window order clause, and then based on that order, you can confine a frame of rows between two delimiters
You define the delimiters by using a window frame clause. The window frame clause requires a window order clause to be present because a set has no order, and without order, limiting rows between two delimiters would have no meaning. |
|
ROWS window frame unit, you can indicate the delimiters as one of three options:
|
■■ UNBOUNDED PRECEDING or FOLLOWING, meaning the beginning or end of the partition,
respectively ■■ CURRENT ROW, obviously representing the current row ■■ <n> ROWS PRECEDING or FOLLOWING, meaning n rows before or after the current, respectively |
|
suppose that you wanted to query the Sales.OrderValues view and compute the running total values from the beginning of the current customer’s activity until the current order.
You need to use the SUM aggregate You partition the window by custid You order the window by orderdate, orderid. You then frame the rows from the beginning of the partition (UNBOUNDED PRECEDING) until the current row |
SELECT custid, orderid, orderdate, val,
SUM(val) OVER(PARTITION BY custid ORDER BY orderdate, orderid ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS runningtotal custid orderid orderdate val runningtotal ------- -------- ----------- ------- ------------- 1 10643 2007-08-25 814.50 814.50 . . 1 11011 2008-04-09 933.50 4273.00 2 10308 2006-09-18 88.80 88.80 . . 2 10926 2008-03-04 514.40 1402.95 |
|
you need to filter the result of the last query, returning only those rows where the running total is less than 1,000.00. The following code achieves this by defining a common table expression (CTE)
Because window functions are supposed to operate on the underlying query’s result set, they are allowed only in the SELECT and ORDER BY clauses. If you need to refer to the result of a window function in any clause that is evaluated before the SELECT clause, you need to use a table expression |
WITH RunningTotals AS (
SELECT custid, orderid, orderdate, val, SUM(val) OVER(PARTITION BY custid ORDER BY orderdate, orderid ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS runningtotal FROM Sales.OrderValues ) SELECT * FROM RunningTotals WHERE runningtotal < 1000.00; custid orderid orderdate val runningtotal ------- -------- ----------- ------- ------------- 1 10643 2007-08-25 814.50 814.50 2 10308 2006-09-18 88.80 88.80 2 10625 2007-08-08 479.75 568.55 2 10759 2007-11-28 320.00 888.55 3 10365 2006-11-27 403.20 403.20 |
|
example for a window frame extent, if you wanted the frame to include only the last three rows, you would use the form
|
ROWS BETWEEN 2 PRECEDING AND CURRENT
ROW. |
|
ROWS vs. RANGE
|
In SQL Server 2012, the ROWS option usually gets optimized much better than RANGE when using the same delimiters
|
|
Window Ranking Functions
|
window ranking functions, you can rank rows within a partition based on specified ordering
T-SQL supports four window ranking functions: ROW_NUMBER, RANK, DENSE_RANK, and NTILE. |
|
Ranking functions example
|
SELECT custid, orderid, val,
ROW_NUMBER() OVER(ORDER BY val) AS rownum, RANK() OVER(ORDER BY val) AS rnk, DENSE_RANK() OVER(ORDER BY val) AS densernk, NTILE(100) OVER(ORDER BY val) AS ntile100 FROM Sales.OrderValues; |
|
ROW_NUMBER function def
|
computes a unique sequential integer starting with 1 within the window partition based on the window ordering.
Function is not deterministic, and if there is no explicit tiebreaker which row gets which number is based on optimization |
|
RANK function def
|
RANK function returns the number of rows in the partition that have a lower ordering value than the current, plus 1. For example, consider the rows in the sample query’s result that have an ordering value of 45.00. Nine rows have ordering values that are lower than 45.00; hence, these rows got the rank 10
(9 + 1). |
|
DENSE_RANK function
|
returns the number of distinct ordering values that are lower than the current, plus 1. For example, the same rows that got the rank 10 got the dense rank 9. That’s because these rows have an ordering value 45.00, and there are eight distinct ordering values that are lower than 45.00
|
|
NTILE function
|
you can arrange the rows within the partition in a requested number of equally sized tiles, based on the specified ordering. You specify the desired number of tiles as input to the function. In the sample query, you requested 100 tiles. There are 830 rows in the result set, and hence the base tile size is 830 / 100 = 8 with a remainder of 30. Because there is a remainder of 30, the first 30 tiles are assigned with an additional row.
|
|
T-SQL supports the following window offset functions
|
LAG, LEAD, FIRST_VALUE, and LAST_VALUE.
|
|
SELECT country, YEAR(hiredate) AS yearhired
FROM HR.Employees WHERE yearhired >= 2003; This query fails with the following error. Msg 207, Level 16, State 1, Line 3 Invalid column name 'yearhired'. Why? |
A typical mistake made by people who don’t understand logical query processing is attempting to refer in the WHERE clause to a column alias defined in the SELECT clause. This isn’t allowed because the WHERE clause is evaluated before the SELECT clause.
|
|
main query clauses specified in the order that you are supposed to type them (known as “keyed-in order”):
logical query processing order of the six main query clauses: |
1. SELECT
2. FROM 3. WHERE 4. GROUP BY 5. HAVING 6. ORDER BY logical 1. FROM 2. WHERE 3. GROUP BY 4. HAVING 5. SELECT 6. ORDER BY |
|
What is the difference between
CAST, CONVERT, PARSE and TRY_CAST, TRY_CONVERT, TRY_PARSE |
Those without TRY fail if the value isn't convertible
Those with TRY return a NULL if the value isn't convertible |
|
CASE Example
|
SELECT productid, productname, unitprice, discontinued,
CASE discontinued WHEN 0 THEN 'No' WHEN 1 THEN 'Yes' ELSE 'Unknown' END AS discontinued_desc FROM Production.Products; |
|
typical use of COALESCE is to substitute a NULL with something else give example
|
COALESCE(region, '') returns region if it’s not NULL and returns an empty string if it is NULL.
|
|
Two tables T1 and T2 and you need to join them based on a match between T1.col1 and T2.col1. The attributes do allow NULLs (col1 and col2 are integer)
you might try to use the predicate COALESCE(T1.col1, -1) = COALESCE(T2. col1, -1), or ISNULL(T1.col1, -1) = ISNULL(T2.col1, -1) What is the problem? |
because you apply manipulation to the attributes you’re comparing, SQL Server will not rely on index ordering. This can result in not using available indexes efficiently. Instead, it is recommended to use the longer form:
T1.col1 = T2.col1 OR (T1.col1 IS NULL AND T2.col1 IS NULL), |
|
IIF(<predicate>, <true_result>, <false_or_unknown_result>)
is equivalent IIF is non-standard used for MS Access |
CASE WHEN <predicate> THEN <true_result> ELSE <false_or_unknown_result> END
|
|
CHOOSE function allows
is non-standard used for MS Access |
CHOOSE(<pos>, <exp1>, <exp2>, …, <expn>)
example, the expression CHOOSE(2, 'x', 'y', 'z') returns 'y' |
|
way to compute the end of the current month
way to compute the end of the current year |
SELECT EOMONTH(SYSDATETIME()) AS end_of_current_month
SELECT DATEFROMPARTS(YEAR(SYSDATETIME()), 12, 31) AS end_of_current_year; |
|
Write a query against the Production.Products table that returns the existing numeric product ID, in addition to the product ID formatted as a fixed-sized string with 10 digits with leading zeros. For example, for product ID 42, you need to return the string '0000000042'.
|
SELECT productid,
FORMAT(productid, 'd10') AS str_productid FROM Production.Products; |
|
Optimize wiht search arguments following query
SELECT orderid, orderdate, empid, custid FROM Sales.Orders WHERE YEAR(orderdate) = 2007 AND MONTH(orderdate) = 2; |
because here you apply manipulation to the filtered column, the predicate is not considered a search argument, and therefore, SQL Server won’t be able to rely on index ordering. You could revise your predicate as a range, like the following.
SELECT orderid, orderdate, empid, custid FROM Sales.Orders WHERE orderdate >= '20070201' AND orderdate < '20070301'; Now that you don’t apply manipulation to the filtered column, the predicate is considered a search argument, and there’s the potential for SQL Server to rely on index ordering. |
|
Standard SQL supports the options _________________ and ______________________ to control how NULLs sort, but T-SQL doesnt support this optiopn
According to standard SQL, a query with an ORDER BY clause conceptually returns a ___________ not a relation |
NULL FIRST and
NULL LAST cursor |
|
Note that the form '2007-02-12' is considered language-neutral only for the data types
|
types DATE, DATETIME2, and DATETIMEOFFSET
this form is considered language-dependent for the types DATETIME and SMALLDATETIME |
|
suppose that you need to filter only orders placed in
February 2007. You can use the YEAR and MONTH functions, as in the following |
SELECT orderid, orderdate, empid, custid
FROM Sales.Orders WHERE YEAR(orderdate) = 2007 AND MONTH(orderdate) = 2; However, because here you apply manipulation to the filtered column, the predicate is not considered a search argument SELECT orderid, orderdate, empid, custid FROM Sales.Orders WHERE orderdate >= '20070201' AND orderdate < '20070301'; |
|
You can sort by elements that do not appear in the SELECT list unless the ______________ clause is also specified
|
DISTINCT
|
|
In addition to supporting the syntax for cross joins with the CROSS JOIN keyword both SQL and TSQL support an older syntax where you specify _______________
|
a comma between the names as in FROM T1, T2
|
|
When talking about INNER JOIN
T1 JOIN T2 is equal to |
T1 INNER JOIN T2
also LEFT OUTER JOIN <> LEFT JOIN RIGHT OUTER JOIN <> RIGHT JOIN FULL OUTER JOIN <> FULL JOIN |
|
LEFT OUTER JOIN
|
LEFT OUTER JOIN (or LEFT JOIN) preserves the left table
returns rows from the left that had no matches in the right table (call those outer rows), with NULLs used as placeholders in the right side |
|
What plays matching role in an OUTER JOIN
|
the ON clause play a matching role aka a row in the preserved side will be returned whether the ON predicate finds a match for it or not
WHERE clause plays a final filtering role to determine which rows will be presented |
|
RIGHT OUTER JOIN
FULL OUTER JOIN |
RIGHT OUTER JOIN (or RIGHT JOIN in short) preserves the right side
T-SQL also supports a full outer join (FULL OUTER JOIN, or FULL JOIN in short), that preserves both sides. |
|
multi-join query evaluates the joins conceptually from
|
left to right, result of one join is used as the left input to the next join.
|
|
Note that if what’s supposed to be a scalar subquery returns in practice more than one value,
|
the code fails at run time
f the scalar subquery returns an empty set, it is converted to a NULL. |
|
Requirements for the internal query when working with TE
|
because a table expression is supposed to represent a relation, the inner query defining it needs to be
relational - all columns returned by inner query need name - all columns must be unique - inner query is not allowed ORDER BY |
|
Derived Tables def
|
a form of TE that closely resembles a subquery that returns an entire table result but a subquery that returns an entire table result
You define the derived table’s inner query in parentheses in the FROM clause of the outer query, and specify the name of the derived table after the parentheses. |
|
SELECT categoryid, productid, productname, unitprice
FROM (SELECT ROW_NUMBER() OVER(PARTITION BY categoryid ORDER BY unitprice, productid) AS rownum, categoryid, productid, productname, unitprice FROM Production.Products) AS D WHERE rownum <= 2; |
the derived table is defined in the FROM clause of the outer query in parentheses, followed by the derived table name.
Then the outer query is allowed to refer to column aliases that were assigned by the inner query. |
|
You don’t nest CTEs like you do derived tables. If you need to define multiple CTEs, you simply separate them by commas.
Sample code |
WITH C1 AS (
SELECT ... FROM T1 WHERE ... ), C2 AS ( SELECT FROM C1 WHERE ... ) SELECT ... FROM C2 WHERE ...; |
|
CTEs also have a recursive form sample
|
The body of the recursive query has two or more queries, usually separated by a UNION ALL operator
WITH EmpsCTE AS ( SELECT empid, mgrid, firstname, lastname, 0 AS distance FROM HR.Employees WHERE empid = 9 UNION ALL SELECT M.empid, M.mgrid, M.firstname, M.lastname, S.distance + 1 AS distance FROM EmpsCTE AS S JOIN HR.Employees AS M ON S.mgrid = M.empid ) SELECT empid, mgrid, firstname, lastname, distance FROM EmpsCTE; |
|
notes about recursive form of CTE
|
1 of the query is called anchor member - returns a valid relational result, invoked once
At least 1 of the query is called recursive member - has a reference to the CTE name, invoked repeatedly until it returns an empty result set |
|
example for a typical correlated sub-query. In this example we are finding the list of employees (employee number and names) having more salary than the average salary of all employees in that employee's department.
|
ELECT employee_number, name
FROM employee AS e1 WHERE salary > (SELECT avg(salary) FROM employee WHERE department = e1.department); |
|
sample TE
|
SELECT categoryid, productid, productname, unitprice
FROM (SELECT ROW_NUMBER() OVER(PARTITION BY categoryid ORDER BY unitprice, productid) AS rownum, categoryid, productid, productname, unitprice FROM Production.Products) AS D WHERE rownum <= 2; |
|
Syntax to crate a view
|
CREATE VIEW Sales.RankedProducts
AS SELECT ROW_NUMBER() OVER(PARTITION BY categoryid ORDER BY unitprice, productid) AS rownum, categoryid, productid, productname, unitprice FROM Production.Products; GO Note that it’s not the result set of the view that is stored in the database; rather, only its definition is stored. SELECT categoryid, productid, productname, unitprice FROM Sales.RankedProducts WHERE rownum <= 2; |
|
inline table-valued functions def
|
similar to view in concepts , however as mentioned, they do support input parameters
|
|
inline table-valued functions sample
|
CREATE FUNCTION HR.GetManagers(@empid AS INT) RETURNS TABLE AS
RETURN WITH EmpsCTE AS ( SELECT empid, mgrid, firstname, lastname, 0 AS distance FROM HR.Employees WHERE empid = @empid UNION ALL SELECT M.empid, M.mgrid, M.firstname, M.lastname, S.distance + 1 AS distance FROM EmpsCTE AS S JOIN HR.Employees AS M ON S.mgrid = M.empid ) SELECT empid, mgrid, firstname, lastname, distance FROM EmpsCTE; GO SELECT * FROM HR.GetManagers(9) AS M; |
|
APPLY operator
|
powerful operator that you can use to apply a table expression given to it as the right input to each row from a table expression given to it as the left input
The two forms of the APPLY operator—CROSS and OUTER—are described in the next sections. |
|
Suppose you query a table with an attribute called fullname formatted as '<first> <last>', and you need to write an expression that extracts the first name part
PATINDEX func. expl. |
LEFT(fullname, CHARINDEX(' ', fullname) -1)
PATINDEX used for when you are looking for a pattern in a string |
|
The following example finds the position at which the pattern ensure starts in a specific row of the DocumentSummary column in the Document table.
|
USE AdventureWorks2012;
GO SELECT PATINDEX('%ensure%',DocumentSummary) FROM Production.Document WHERE DocumentNode = 0x7B40; GO |