Database Management: SQL Select

Please note that the following blog post provides a summary view for what you need to get done (left column) and quick examples that illustrate how to do it in SQL (right column with SQL code in red). For more information please see the resources below:

Select
SELECT *
  FROM dept
Parentheses
SELECT ename, sal, 12*(sal+100)
  FROM emp
Column Alias
SELECT ename, AS name, sal salary
  FROM emp
Eliminating Duplicate Rows
SELECT DISTINCT deptno
  FROM emp
Selecting with WHERE
SELECT ename, job, deptno
  FROM emp
  WHERE job=’Clerk’
Comparison Operators in WHERE
SELECT ename, sal, comm
  FROM emp
  WHERE sal<=comm
Between Operators in WHERE
SELECT ename, sal, comm
  FROM emp
  WHERE sal BETWEEN 1000 AND 1500
Test Values in a list
SELECT ename, empno, mgr
  FROM emp
  WHERE mgr IN (7902, 7566, 7788)
Pattern Matching in WHERE

_ is one space character *

% is trailing characters *

* is all characters

SELECT ename
  FROM emp
  WHERE ename LIKE ‘_A%’
Both conditions in WHERE are true
SELECT ename, empno, job, sal
  FROM emp
  WHERE sal>=1100
  AND job=’Clerk’
One or the other in WHERE are true
SELECT ename, empno, job, sal
  FROM emp
  WHERE sal>=1100
  OR job=’Clerk’
Not Operator in WHERE
SELECT ename, empno, job, sal
  FROM emp
  WHERE job NOT IN (’Clerk’, ’Manager’, ’Analyst,)
Sort rows
SELECT ename, job, deptno, hiredate
  FROM emp
  ORDER BY hiredate
Descending order sorting of rows
SELECT ename, job, deptno, hiredate
  FROM emp
  ORDER BY hiredate DESC
Table aliases
SELECT e.empno, e.ename, e.deptno,  d.deptno, d.loc
  FROM emp e, dept d
  WHERE e.deptno = d.deptno

 

Database Management: SQL Basics

Please note that the following blog post provides a summary view for what you need to get done (left column) and quick examples that illustrate how to do it in SQL (right column with SQL code in red). For more information please see the resources below:

Create table
CREATE TABLE dept
  (deptno NUMERIC(2),
   dname VARCHAR(14),
   loc VARCHAR(14))
Add Column
ALTER TABLE dept
  ADD job VARCHAR (9);
Rename Table
RENAME dept TO department
Rename Column
ALTER TABLE dept ALTER COLUMN job RENAME TO career
Delete Table
DROP TABLE dept
Not Null in a table (forcing there to be a value in that variable in the table)
CREATE TABLE dept
  (deptno NUMERIC(2) NOT NULL,
   dname VARCHAR(14),
   loc VARCHAR(14))
Unique (allows for null values, but doesn’t allow for the same variable to be repeated)
CREATE TABLE dept
  (deptno NUMERIC(2),
   dname VARCHAR(14),
   loc VARCHAR(14),
   CONSTRAINT dept_dname_uk UNIQUE (dname))
Primary Key (doesn’t allow for null values)
CREATE TABLE dept
  (deptno NUMERIC(2),
    dname VARCHAR(14),
    loc VARCHAR(14),
    CONSTRAINT dept_dname_uk UNIQUE (dname),
    CONSTRAINT dept_deptno_pk PRIMARY KEY(deptno))
Foreign Keys (connect data with other tables)
CREATE TABLE emp
  (empno NUMERIC(4),
   ename VARCHAR(10) NOT NULL,
   job VARCHAR(9),
   mgr NUMERIC(4),
   hiredate DATETIME,
   sal MONEY,
   comm MONEY,
   deptno NUMERIC(2) NOT NULL,
   CONSTRAINT emp_deptno_fk FOREIGN KEY (deptno) REFERENCES dept (deptno))
)
Check Constraints (sanity checks)
CREATE TABLE Dept
  (deptno NUMERIC(2),
    dname VARCHAR(14),
    loc VARCHAR(14),
    CONSTRAINT dept_dname_uk UNIQUE (dname),
    CONSTRAINT dept_deptno_pk PRIMARY KEY(deptno),
    CONSTRAINT emp_deptno_ck CHECK (deptno BETWEEN 10 AND 99))
Insert new rows of data
INSERT INTO dept (deptno, dname, loc)
  VALUES (50, ‘development’, ‘Detroit’)
Change data within the table
UPDATE emp
  SET deptno = 20
  WHERE empno = 7782
Removing a row form a table
DELETE FROM dept
  WHERE dname = ‘development’
Removing data entry in all tables (CAUTION)
DELETE CASCADE FROM dept
  WHERE dname = ‘development’

NOTE:

  • When doing delete stuff, do a SELECT statement with the WHERE clause to make sure you are not shooting in the dark.

 

Data Tools: Artificial Intelligence

Big data Analytics and Artificial Intelligence

Artificial Intelligence (AI) is an embedded technology, based off of the current infrastructure (i.e. supercomputers), big data, and machine learning algorithms (Cyranoski, 2015; Power, 2015). Though previously, AI wasn’t able to come into existence without the proper computational power that is provided today (Cringely, 2013).  AI can make use of data hidden in “dark wells” and silos, where the end-user had no idea that the data even existed, to begin with (Power, 2015).  The goal of AI is to use huge amounts of data to draw out a set of rules through machine learning that will effectively replace experts in a certain field (Cringely, 2013; Power, 2015). Cringely (2013) stated that in some situations big data can eliminate the need for theory and that AI can aid in analyzing big data where theory is either lacking or impossible to define.

AI can provide tremendous value since it builds thousands of models and correlations automatically in one week, which use to take a few quantitative data scientist years to do (Dewey, 2013; Power, 2015).  The thing that has slowed down the progression of AI in the past was the creation of human readable computer languages like XML or SQL, which is not intuitive for computers to read (Cringely, 2013).  Fortunately, AI can easily use structured data and now use unstructured data thanks to everyone who tags all these unstructured data either in comments or on the data point itself, speeding up the computational time (Cringely, 2013; Power, 2015).  Dewey (2013), hypothesized that not only will AI be able to analyze big data at speeds faster than any human can, but that the AI system can also begin to improve its search algorithms in phenomena called intelligence explosion.  Intelligence explosion is when an AI system begins to analyze itself to improve itself in an iterative process to a point where there is an exponential growth in improvement (Dewey, 2013).

Unfortunately, the rules created by AI out of 50K variables lack substantive human meaning, or the “Why” behind it, thus making it hard to interpret the results (Power, 2015).  It would take many scientists to analyze the same big data and analyze it all, to fully understand how the connections were made in the AI system, which is no longer feasible (Cringely, 2013).  It is as if data scientist is trying to read the mind of the AI system, and they currently cannot read a human’s mind. However, the results of AI are becoming accurate, with AI identifying cats in photographs in 72 hours of machine learning and after a cat is tagged in a few photographs (Cringely, 2013). AI could be applied to any field of study like finance, social science, science, engineering, etc. or even play against champions on the Jeopardy game show (Cyranoski, 2015; Cringely, 2013; Dewey, 2013; Power, 2015).

Example of artificial intelligence use in big data analysis: Genomics

The goal of AI use on genomic data is to help analyze physiological traits and lifestyle choices to provide a dedicated and personalized health plan to treat and eventually prevent disease (Cyranoski, 2015; Power, 2015).  This is done by feeding the AI systems with huge amounts of genomic data, which is considered big data by today’s standards (Cyranoski, 2015). Systems like IBM’s Watson (an AI system) could provide treatment options based on the results gained from analyzing thousands or even millions of genomic data (Power, 2015).  This is done by analyzing all this data and allowing machine learning techniques to devise algorithms based on the input data (Cringely, 2013; Cyranoski, 2015; Power, 2015).  As of 2015, there is about 100,000 individual genomic data in the system, and even with this huge amounts of data, it is still not enough to provide the personalized health plan that is currently being envisioned based on a person’s genomic data (Cyranoski, 2015).  Eventually, millions of individuals will need to be added into the AI system, and not just genomic data, but also proteomics, metabolomics, lipidomics, etc.

Resources: