SQL / Azure Setup

SQL Option 1 – Microsoft Azure SQL Server

The first step is to set up a database in the cloud that we can write and read to. There are a few options available, Amazon AWS, Microsoft Azure, Google Cloud Platform etc. For this, I’m choosing to setup my SQL server on an Azure platform (you can get access to $25 a month free Azure usage for a year with Visual Studio Dev Essentials). I’ve personally used all 3, and I find Azure to be a little more user friendly.

  1. You need a Microsoft/Hotmail/Live/Outlook account for this
  2. Go to portal.azure.com and add a new database resource (there’s separate tutorials on this should you need it, Azure makes it pretty straightforward though)
    • Make sure you choose a cheap pricing tier if you’re just experimenting!
  3. Download and install SQL Server Management Studio (SSMS)
  4. Log in to SSMS:
    1. Server address: found in Azure under “Overview”
    2. Username: easiest found via “Overview” > “Show Database Connection Strings” (mine was in the format username@123456)
    3. Password: the one you setup the server with
  5. You may then be asked to authenticate using your windows credentials
  6. Once logged in, expand the tree in SSMS like so:
    • server.name
      • Databases
        • database_name
          • Tables
  7. Right click on “Tables” and choose “New” > “Table”
  8. Now we add the columns for the data we want. The data we’re adding is:
  9. right click on the “hash_id” column and make it a “primary key”. This will make a small key icon appear *
  10. To the right of the SSMS, rename the table to something more representative. I’m going to rename mine “news_tracker”
  11. Click “Save”
  12. Optional (Creating a guest account):
    1. Click the top level “Databases” in the tree on the left
    2. Click “New Query”
    3. Create a new user account by running an SQL query:
    4. CREATE LOGIN publicaccess WITH PASSWORD = 'Jackanory';
    5. You should see “Command(s) completed successfully.” in the bottom panel

* In a database, you have “keys”. These are unique to each row, and should be something that would ensure 100% that row not be confused with another. For instance, if you were to create a database of books, the key would be the ISBN on the back. If you wanted a database of cars, the key would be the licence plate. I’m simplifying it, but this is the purpose of “keys”. You can have multiple keys if you wish, by using primary, secondary etc. This is what we should use, as for the data we want to collect, we will care about the source, and the date. These would be the primary and secondary keys. However, to keep things simple, we’re just going to create a hash from those two variables, and make THAT the primary key.

 

Install “pyodbc” module

  1. Install the “Microsoft ODBC Driver for SQL” – ODBC is the universal standard API for accessing a database via code
  2. Install using pip by running:
  3. pip install pyodbc==3.1.1 (correct version as of Feb 2017)
  4. What’s pip? Pip is a package manager for Python, and ships as part of Python 3.4+. For earlier versions, you need to install it…
  5. Download the latest “tar.gz” file from https://pypi.python.org/pypi/pip#downloads
  6. Extract the file using WinRAR or a similar utility
  7. Copy the contents of this directory to your Python root (e.g. C:\Python27)
  8. Download and install the exe file from https://pypi.python.org/pypi/setuptools that matches your version of Python (I’m on 2.7)
  9. Open a command prompt and navigate to your C:\PythonXX directory and run:
  10. python.exe setup.py install
  11. Keeping the command window at C:\PythonXX, run:
  12. python.exe pip install pyodbc
  13. Restart your IDE
  14. In your IDE, execute the following to ensure everything has worked OK:
  15. import pyodbc
  16. Now you’re all set up, lets try connecting to our database

Connecting to SQL With Python

SQL interactions are accomplished by creating a command and then executing it. If it’s a command that is making a change, then you must always remember to call “.commit()” to ensure the change is submitted.

  1. We will connect using the pyodbc module, so make sure you import it
  2. We will also need:
  3. Server address
  4. Database name
  5. Table name
  6. Username and password to log in
  7. Here’s a simple snippet to inject some test data into our new database:
  8. You can preview and delete this data in the database by right clicking on the table name in the tree in SSMS and choosing “Edit Top 200 Rows”
  9. The value “hash_id” is our key. If we try to run this code twice, then we’ll get an error, as the same key cannot be added more than once. To fix this, we can add a check to ensure it doesn’t exist yet:

fd

 

There are a few popular web-based visualization systems: Periscope, Chartio, Looker, Tableau. These are mostly the same, with the exception that Periscope is less “drag and drop” and more SQL query based. Tableau Desktop is free for students. Each service offers a free trial period, but some are only 14 days. So if you’re using this to prove it’s usefulness to management, then wait until you have the data in the database before signing up 🙂 You can also try each one in turn to see which fits your usage the best.

 

Further Reading:

pyodbc Wiki

Leave a Reply

Your email address will not be published. Required fields are marked *