Using DML Statements – Notes from MVA Database Fundamentals

https://mva.microsoft.com/en-US/training-courses/database-fundamentals-8243?l=IAdTlyJy_004984382

Using DML Statements

I’ve been trying to get through this section of the course for a while now, so I’m basically forcing myself to take notes.

 ————–

DML – Data Manipulation Language

Common DML Statements:

SELECT – retrieves data
INSERT – adds data
UPDATE – modifies data
DELETE – removes data
BULK INSERT – imports a data file

The SELECT statement basically retrieves the data you request.

The most basic one is:

SELECT * FROM tablename;

This one will retrieve all rows and columns (including duplicates) from the table, “tablename.”

If you don’t want duplicate rows, you could use the following instead:

SELECT DISTINCT * from tablename;

I bring up the SELECT DISTINCT because in my work I almost always use SELECT DISTINCT rather than just SELECT, as I only very rarely want duplicates.

To limit the number of rows returned, one thing you can do is add a WHERE clause after the “FROM” clause that winnows down the data a bit more.

I’m only 2 minutes in, but it’s time for my bed. More notes tomorrow. 🙂

Advertisements

CASE WHEN

I ran into this today, and it looks really helpful for times when you need to have a certain value in a field which is based on (but not identical to) a value in one of your tables.

To illustrate: say you needed to pull data on a person which included whether or not said person was retired. The has “Retired” and “Working” as possible options in the “Retirement” field of the “People” table.

The Select query may look like this:

SELECT FirstName, LastName, DOB, Retirement 
FROM People;

But what if you were required to provide a code instead of “Retired” or “Working”? Maybe “Retired” needs to come back as “01” and “Working” needs to come back as “02.”

The way I would have dealt with this on my own is to make a tiny table with the codes and the retirement status in separate columns, then joined the table to the “People” table. But there’s another way – I think maybe a better way in terms of stress on the server’s CPU. (Is it better?)

It’s the “CASE WHEN” clause, and this is what it looks like:

SELECT FirstName,
                LastName,
                DOB,
                CASE
                          WHEN Retirement = ‘Retired’ THEN ’01’
                          WHEN Retirement = ‘Working’ THEN ’02’
                 END
                         AS RetirementStatus
FROM People;

I did find a real explanation of CASE modeanalytics.com – and it looks like it might be a good resource for learning SQL in general.

 

DROP IF EXISTS (DIE) pre-SQL Server 2016

SQL Server 2016 has a much more simple way to do this, but if you need to be able to re-run queries which create tables, and where the tables may or may not still exist, it can be difficult for a newbie to know how to do this gracefully.

If you run the queries and the table(s) already exist, your query errors out.

If you preface your query with a drop table query, and the table doesn’t exist, you also get stopped dead in your tracks.

SQL Server 2016 has a simple way to do this:

DROP TABLE IF EXISTS tablename;

My understanding is that this is not a new idea in the greater SQL world – it’s just newly implemented in SQL Server.

However, if you aren’t blessed with SQL Server 2016, there’s another way:

IF OBJECT_ID(‘[databasename].[schema].[tablename]’) IS NOT NULL DROP TABLE [databasename].[schema].[tablename];

What’s more, if you’re working with temp tables:

IF OBJECT_ID(‘tempdb..#temptablename’) IS NOT NULL DROP TABLE #tablename;

This is likely clunky for you smooth, practiced, experienced folks out there. But learning this was a god(dess?)-send for a newbie like myself. 🙂

Here’s a link from someone more practiced at these things than I:

DROP IF EXISTS

UNION INTO

So, this appears to have worked:

SELECT *
INTO [COPD].[dbo].[COPDCombinedGlucoseResults]
FROM [COPD].[dbo].[COPDLabGlucoseResults]
UNION
SELECT *
FROM [COPD].[dbo].[COPDFlowsheetGlucoseResults];

The new table has 156558 rows

The LabGlucose table had 137417 distinct rows and the Flowsheet Glucose table had 72986 distinct rows, so I think that means there were 53845 matching rows.

Just for grins, and because this was my first real-live UNION query, I thought it would be interesting to do an INTERSECT query to see if the 53845 number holds water and to test my newbie understanding of these queries:

SELECT *
FROM [COPD].[dbo].[COPDLabGlucoseResults]
INTERSECT
SELECT *
FROM [COPD].[dbo].[COPDFlowsheetGlucoseResults];

Lo and Behold: 53845 rows. So if you add the results of a UNION query to the results of an INTERSECT query, you will get the same number of rows as both tables separately.

I read today about the UNION ALL query, and it appeared to be the version of the UNION query that included duplicate rows.

By this logic, a UNION query + an INTERSECT query = a UNION ALL query. I must test this.

The number of rows from each table add up to: 210403

SELECT *
FROM [COPD].[dbo].[COPDLabGlucoseResults]
UNION ALL
SELECT *
FROM [COPD].[dbo].[COPDFlowsheetGlucoseResults];

and we have a winner! 210403 rows.

I think I might possibly be falling in love with SQL. I’ve almost forgotten my garden.

Import a table from Microsoft Access into SQL Server

Preliminary: Save your Access Database in the 2002-2003 file format. That means it must be a .mdb.

These are the steps for using the wizard to import the table:

  1. Right-click on your database, and choose Tasks –> Import Data..20160621 Import Step1 Tasks
  2. Choose “Microsoft Access (Microsoft Access Database Engine)” from the Data source dropdown menu in the SQL Server Import and Export Wizard.20160621 Import Step2 ImportExportWizard1
  3. Find your database using the Browse button unless you really just like typing things in. Then hit “Next.”
    20160621 Import Step3 ImportExportWizard2
  4. In the “Destination” field, choose “SQL Server Native Client 11.0.” Choose your Server Name, and fill in your Database name, and hit “Next.”
    20160621 Import Step4 ImportExportWizard3
  5. On the next page, choose “Copy data from one or more tables or views.” Unless you’re more fancy than me and want to try that second option. Then click “Next.”
    20160621 Import Step5 ImportExportWizard4
  6. Click on the box(es) for the table(s) you want. Then click “Edit Mappings.”
    20160621 Import Step6 ImportExportWizard5
  7. The default for these values was nvarchar which is a Unicode datatype, and not compatible with the data I hope to connect it to. Changed values to varchar. Click “OK” to go back to the previous window. You can also edit SQL if you’d like.
    20160621 Import Step7 ImportExportWizard6
  8. You can preview the data by clicking “Preview”
    20160621 Import Step8 ImportExportWizard7
  9. I didn’t change anything on this page. Hit “Next.”
    20160621 Import Step9 ImportExportWizard8
  10. Hit “Next” on this page too.
    20160621 Import Step10 ImportExportWizard9
  11. If everything is kosher, the file should upload correctly.
    20160621 Import Step11 ImportExportWizard10

Moving a Temp table to local server

So, I am still attempting to get my local server and the big server to cooperate. When I try to move a temp table to the local server (where I have import/export rights), I get the following error:

/*————————
USE COPD
GO

SELECT *
INTO COPD.dbo.InpatientStay
FROM [BIGSERVER].dbo.##COPDInpatientStay;
————————*/
Database name ‘BIGSERVER’ ignored, referencing object in tempdb.
Database name ‘BIGSERVER’ ignored, referencing object in tempdb.
Msg 208, Level 16, State 0, Line 4
Invalid object name ‘##COPDInpatientStay’.

Off to google this error…

(COPD is a database on my local machine.)

Today’s T-SQL Headache

Today I came into the office in hopes of re-creating the below report using SQL Server. First I thought it might be nice to upload a table into my local DB that had the list of departments at the location requested for creating the patient list.

COPD Report Criteria

First, I couldn’t connect to the local server. It was basically the same error as I got on my windows 10 computer at home, except this was my windows 7 computer at work. Thankfully, a facebook friend’s dad was able to help me figure out the issue. It turned out I needed to turn on TCP/IP and Named Pipes in the SQL Server Configuration Manager, and then restart the SQL Server Service. Yay!

But I still can’t seem to be able to get a query to work which uses tables in my local db and tables in the big server. I read that I needed to create a linked server, so I liked the big server to my local database in the “Server Objects” but I still haven’t found how to write a query in which fields from tables in both places don’t end up with invalid object names. So today – today was mostly frustration. The possibilities are exciting and I learned how to do some neat stuff, but I didn’t get to do the things I had hoped to do.

 

T-SQL Queries: UPDATE one table with data from another table

20160618 UPDATE one table with data from another table

This is something you would never want to do with these two tables in real life, but I wanted to try out the functionality so that tomorrow I could go up to work and try it on a temp table as part of a project we are currently working on.

I also want to try this with a

WHERE Sales.SalesPerson.BusinessEntityID = Sales.SalesPersonQuotaHistory.BusinessEntityID

instead of the “FROM” clause – though if this has a chance of working, I need to update the set line to:

SET Sales.SalesPerson.Bonus = Sales.SalesPersonQuotaHistory.Bonus

I guess we shall see. 🙂

the “left” and “right” query on either side of an INTERSECT/EXCEPT clause

 

EXCEPT and INTERCEPT clauses
EXCEPT and INTERCEPT

One thing I didn’t even notice in the last post was that the SELECT query above the EXCEPT/INTERSECT was referred to as the query “on the left,” and the one below the EXCEPT/INTERSECT was referred to as “on the right.”

This actually does make sense if you imagine reading these things left to right. However, it sounded like one of those little differences in meaning that could trip a newbie like me up in training. Thankfully the guys who did this one made sure to point out the terminology and what it meant.

The EXCEPT and INTERSECT clauses

I’d heard tell of the “EXCEPT clause,” but this is my first time seeing it. Isn’t it beautiful? I have wished before for such a function, but I didn’t expect it to look so clean on the page. In Access, I have jumped through a few hoops to achieve a result similar to what this appears to do.

20160615 EXCEPT and INTERSECT clauses

The INTERSECT Clause sounds pretty cool too. I think the equivalent of this definition (at least as far as data returned) in Access would be making inner joins to the fields you want and only pulling those fields down into the bottom.

20160615 Access INTERSECT equivalent

^ Here’s my attempt to represent my understanding of the INTERSECT if represented in Microsoft Access. But here’s the rub. If I were to write a SQL query for the above Access query, I think I would write it as follows:

20160615 Access INTERSECT equivalent back in SQL

This is a very different-looking construct. I went ahead and ran it both ways in SQL and got very different results. For the results of the query using the inner join, I got 72591 rows, and in the results using the INTERSECT query from the Microsoft training, I got a mere 238 rows. Looking closer, I notice that my INTERSECT query doesn’t have immediately obvious repeated values, and my Inner Join query does. I go back and add a “Group By” clause, and then my results match. In these queries using the AdventureWorks tables, both queries ran very quickly. I do not know how the dark underbelly of SQL Server may treat these two queries, so I don’t know if they are equivalent in CPU usage or just in data returned. Here are the two queries with the equivalent results, if anybody else is looking to play:

20160615 SQL Query Comparison
Both of these queries return 238 matching rows if using the AdventureWorks 2014 database.

And just to round out this post, I did go back into Microsoft Access, add the “Group By” and found 238 rows that way as well:

20160615 Access INTERSECT equivalen correctedt

All things being equal, I love the idea of the INTERSECT query.

The BETWEEN Clause

I’m watching Microsoft Virtual Academy’s Database Fundamental’s class – the Introduction to Core Database Concepts section, and there is an example of the BETWEEN clause using the AdventureWorks 2014 database:

SELECT BusinessEntityID, JobTitle, VacationHours
FROM HumanResources.Employee
WHERE VacationHours BETWEEN 75 AND 100;

I ran this query in my database, and retrieved 75 rows.The lowest number of hours in the VacationHours field is 75, and the highest is 99.

I get confused with what the word “BETWEEN” means in queries. In Access, I avoided using “Between” in favor of “> x and <y” in the criteria of a field, so that I knew exactly what I was asking for. Looking at the data returned in this query, I find myself asking again – is 100 included in “BETWEEN 75 AND 100”?

I check to see if the table has any values equal to 100 in the VacationHours field:

SELECT BusinessEntityID, JobTitle, VacationHours
FROM HumanResources.Employee
WHERE VacationHours = 100;

No data was returned, so that first query wasn’t excluding values of 100 necessarily – they just weren’t there to begin with.

Since I know the value 99 is in this field, I alter the query:

SELECT BusinessEntityID, JobTitle, VacationHours
FROM HumanResources.Employee
WHERE VacationHours BETWEEN 75 AND 99;

I again retrieved 75 rows, lowest number of hours in the VacationHours field is 75, and the highest is 99 – the same as the initial query results. I think I can reasonably conclude from this that “BETWEEN” includes that end number.

This seems easy enough for integers, and maybe with dates that are not datetimes(?), but if this were a decimal field, I don’t know what would happen to a value of 100.1 in the first query, or a value of 99.1 in the last. More to investigate.

How T-SQL Thinks? Sorting: Order By, Joins, Group By?

I started out my morning delightfully forgetful yesterday’s research mission.

As I fed my cats, made coffee, drove to work, and began settling into my day, I intermittently listened (and re-listened) to the first third of Episode 29 of SQL Data Partners Podcast with Carlos L Chacon. This episode is called, “You Thinking What I Am Thinking?” and it’s about how SQL Server “thinks.” The interview is with Brent Ozar, and I think it’s a really interesting interview even to a newbie like me whose understanding of SQL Server / T-SQL is so limited.

From the perspective of a person in my position who wants to turn out queries (often based on unique, one-time, complex requests for years’ worth of data from multiple tables) with results in the most finished way possible, it was a bit distressing to hear Mr. Ozar talk about how applying an ORDER BY to a query is like asking SQL Server to go through an the white pages in an old phone book and manually re-order all the names by first name – and how that would require going through and scribbling all the names down as you went to get them in the right order. .

On Sorting and Joins-

Mr. Ozar mentioned that when you join tables together, even if there isn’t an ORDER BY in the query, there is an “implicit order by” created by the join. Uh oh! In our group, we rarely have a need to run a query that doesn’t involve a few joins.

On Sorting and Group By –

I mentioned this podcast episode to my boss this morning as something he might also be interested in – and he wanted to know if GROUP BY was included in this general CPU strain scenario. My answer right now? I don’t know but it makes sense to me if Joins and ORDER BY would. Googling gave me very some rather complex results (for where I am.)

I don’t yet understand indices at all in SQL Server, and it looks like they play a significant role in the way these things work. I am looking forward to getting a better understanding of these things, and will report back.

 

 

Three Terms to Research

My boss came in today and asked if I knew about any of the three following terms as they apply to SQL Server:

  • Query Optimization
  • Performance Tuning
  • Execution Plan

He said they were not worth my investigating if I did not know them, but that the person in charge of our database had brought them up to him.

I don’t have a specific plan of study in mind, but I heard in a podcast (I think it was the “SQL Data Partners Podcast with Carlos L. Chacon”) that SQL has an “S-shaped” learning curve, where you learn a lot quickly in the beginning, and then it gets harder.

Because I want to be as sensitive as I can be regarding server performance from the beginning, I would like to try to learn things with an eye toward not screwing things up for our DBAs. I’ve already slowed down the server enough to have had my boss called regarding one of my queries recently, and I don’t want that to happen again.

Queries inspired by MVA Database Fundamentals course

I’m currently working through Microsoft Virtual Academy’s Database Fundamentals course. I really appreciate how likable these guys are. In fact, I am noticing this as a trend in many of the SQL resources I have been looking into – there are a lot of really relatable, helpful people out their teaching about SQL.  I hope this trend continues!

Anyway, they showed how to make a UNION statement, and in their example they used two select queries pulling results from the same table. I might be wrong, but I think that same query could be done by just making a more complex WHERE clause. However, they said in the video that you could union data from two separate tables if the columns were the same. To test that theory, I first made a duplicate of the

20160612 select into new table and union

The results appeared to come back correct. I was having difficulty yesterday with SQL Server telling me the objectID HumanResources.Employee2 wasn’t valid (I didn’t save the exact error, but it showed up if I hovered over my query), but that has cleared up today. (Because of reboot?) I had planned to continue trying to figure this out, and wondered if it had to do with a lack of primary key or some other strangeness.

Anyway, it was slightly exciting to create my first table using a SELECT / INTO statement, and also to use the UNION clause for the first time. The UNION query is almost identical to the one from the Database Fundamentals class, except for the second “FROM” being the Employee2 table instead of the Employee table.

I think it’s important to know you can create a table using the SELECT/INTO in this way, because when I very first read about this type of query, I understood it to be more like Microsoft Access’ Append query. I think it is like that, except it will create the table if the table doesn’t already exist. That’s pretty cool! That leaves the “CREATE TABLE” query for creating a blank table structure, which is definitely different from Microsoft Access.

Windows 10 turned off my SQL Server Service

Good morning. 🙂

Brief introductions: I’m Natalie, and I’m at the very beginning stages of learning SQL, and I’ve been contemplating starting a blog just to document the stuff I’m trying to soak in.

Long story short: I come from a Microsoft Access background, and my employer has impressed upon me that I must learn T-SQL querying in order to keep my job. I had been living without a computer outside of my iphone for almost a year, but last week I bit the bullet and bought a new Windows 10 laptop.

I installed SQL Server 2016 Express on it. (We are using 2014 at work, but I just couldn’t help myself.) I loaded the 2014 version of AdventureWorks on it, and was following along with Microsoft Virtual Academy’s Database Fundamentals course, re-creating the queries as I watched the material. All was well as of last night.

I fell asleep around midnight after updating the comments in a few of my practice queries, and when I woke up my laptop had gone through an update. I’m new to Windows 10, so I’m sure this isn’t exactly unique insight, but I found it slightly amusing and slightly threatening how Windows slowly let me back into my machine, reassuring me that my files were all still in the same place but that now my computer would work even better. I remember thinking that the underlying message seemed to be, “Don’t freak out!”

All seemed well – the Word document I had open the night before had auto-saved and was recovered. I went to SQL Server, chose my local server, and hit “Connect.” This is what I got back:

20160612 screenshot of error
Failed to connect to server x\SQLEXPRESS. (Microsoft.SqlServer.ConnectionInfo) Additional Information: A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: SQL Network Interfaces, error: 26 – Error Location Server/Instance Specified) (Microsoft SQL Server)

So, what the heck, Microsoft? It worked fine last night. How can I not connect to the data on my own hard drive?

Google got me this far: “Was facing the same issue. Realized that Windows 10 had turned off the SQL Server Service. To resolve, open ‘services.msc’ and start service of SQL Server ()””

I searched for “Services” from my start menu and opened them up. Here are the SQL Server ones:

20160612 screenshot of services

It may be ham-fisted of me, but I changed the disabled ones to automatic without delayed start. I don’t know which one was causing the issue. (I did this by right-clicking, going to properties, and choosing this from the drop-down.)

Thankfully this did resolve the issue, and I was able to continue with my studies.

I would love feedback if someone has more info on this – hopefully I haven’t opened up a wormhole. 😀