Big Data Analytics: Career Prospects

There is a wealth of employment opportunities for professionals with knowledge of big data analytics. This post explores what employment opportunities currently exist for people graduating with a concentration in data analytics.

Master’s and doctoral graduates have some advantages over undergraduates because they have done research or capstone projects involving big datasets. They can explain the motivation and reasoning behind the work (chapters 1 and 2 of the dissertation), they can learn and adapt quickly (chapter 3 reflects what you have learned and how you will apply it), and they can think critically about problems (chapters 4 and 5 of the dissertation). Doctoral students work on one problem for months or years to reach a solution that fills a gap in knowledge, and they could not imagine leaving that gap unfilled. To prepare best for a data science or big data position, however, the doctoral work shouldn’t be purely theoretical; it should include an analysis of huge datasets. Based on my personal analysis, when applying for a senior-level or team lead position in data science, a doctorate counts as an additional three years of experience on top of what you already have. Without a doctorate, you need a Master’s degree and three years of experience to be considered for the same positions.

Master’s-level courses in big data help build strong mathematical, statistical, computational, and programming skills. Doctorate-level courses help you learn and push the limits of knowledge in all of these fields, and they also help you become a domain expert in a particular area of data science. Commanding that domain expertise, which is what you gain by going through a doctoral program, can make you more valuable in the job market (Lo, n.d.). Being more valuable in the job market can allow you to demand more in compensation. Different sources quote different salary ranges, mostly because this field has yet to be standardized (Lo, n.d.). Thus, I provide only two sources for salary ranges.

According to Columbus (2014), jobs that involve big data, such as Big Data Solution Architect, Linux Systems and Big Data Engineer, Big Data Platform Engineer, and Lead Software Engineer, Big Data (Java, Hadoop, SQL), have the following salary statistics:

  • Q1: $84,650
  • Median: $103,000
  • Q3: $121,300
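
As a quick way to read these quartiles, the sketch below computes the interquartile range, that is, the width of the band containing the middle 50% of reported salaries. The three figures are the ones quoted from Columbus (2014); the variable names are illustrative assumptions.

```python
# Quartile figures quoted above from Columbus (2014), in US dollars.
q1 = 84_650
median = 103_000
q3 = 121_300

# Interquartile range: width of the band holding the middle 50% of salaries.
iqr = q3 - q1

# Distance of each quartile from the median; similar values suggest
# the middle of the distribution is roughly symmetric.
lower_gap = median - q1
upper_gap = q3 - median

print(f"IQR: ${iqr:,}")
print(f"Median - Q1: ${lower_gap:,}")
print(f"Q3 - Median: ${upper_gap:,}")
```

The two gaps come out nearly equal, so half of the salaries in this sample fall within about $18K of the median on either side.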

Columbus (2014) also stated that it is very difficult to find the right people for an open requisition and that most requisitions remain open for 47 days. According to Columbus (2014), the most wanted skills for data analytics jobs, based on requisition postings in the field, are Python (96.90% growth in demand in the past year), and Linux and Hadoop (76% growth in demand each).

Lo (n.d.) states that individuals with just a BS or MS degree and no full-time work experience should expect $50-75K, whereas data scientists with experience can command $65-110K.

  • Data scientists can earn $85-170K
  • Data science/analytics managers can earn $90-140K for 1-3 direct reports
  • Data science/analytics managers can earn $130-175K for 4-9 direct reports
  • Data science/analytics managers can earn $160-240K for 10+ direct reports
  • Database administrators can earn $50-120K, increasing with experience
  • Junior big data engineers can earn $79-115K
  • Domain expert big data engineers can earn $100-165K

One way to look for currently available opportunities in the field is to consult Gartner’s Magic Quadrant for Business Intelligence and Analytics Platforms (Parenteau et al., 2016). If, as a data scientist, you want to help push a tool toward a higher ability to execute and completeness of vision, consider employment at: Pyramid Analytics, Yellowfin, Platfora, Datawatch, Information Builders, Sisense, Board International, Salesforce, GoodData, Domo, Birst, SAS, Alteryx, SAP, MicroStrategy, Logi Analytics, IBM, ClearStory Data, Pentaho, TIBCO Software, BeyondCore, Qlik, Microsoft, and Tableau. That is one way to look at this data. Another is to see which tools are the best in the field (Tableau, Qlik, and Microsoft, with SAS, Birst, Alteryx, and SAP following behind) and learn those tools to become more marketable.

Big Data Analytics: POTUS Report

Ours has become a data-centric society, relying on real-time data and technology (e.g., cell phones, online shopping, social networking) more than ever. Although there are many advantages associated with the use of this data, there are concerns that the collection of massive amounts of data can lead to an invasion of privacy. In January 2014, President Obama asked his staff to take the next 90 days to prepare a report for him on how big data is affecting people’s privacy. This post revolves around this report.

The aim of big data analytics is for data scientists to fuse huge amounts of data from various sources and of various types in order to find relationships, identify patterns, and detect anomalies. Big data analytics can provide a descriptive, prescriptive, or predictive answer to a specific research question. It isn’t perfect: sometimes the results are not significant, and we must remember that correlation is not causation. Regardless, big data analytics offers many benefits, and policy has yet to catch up to the field in order to protect the nation from potential downsides while still promoting and maximizing those benefits.

Policies for maximizing benefits while minimizing risk in the public and private sectors

In the private sector, companies can create detailed personal profiles that enable personalized services for consumers. Interpreting personal profile data allows a company to retain and command more market share, but it also leaves room for discrimination in pricing, service quality or type, and opportunities through “filter bubbles” (Podesta, Pritzker, Moniz, Holdren, & Zients, 2014). Policy recommendations should encourage de-identifying personally identifiable information to the point that the data cannot be re-identified. Current policies that promote privacy in the private sector are (Podesta et al., 2014):

  • The Fair Credit Reporting Act helps promote fairness and privacy of credit and insurance information.
  • The Health Insurance Portability and Accountability Act enables people to understand and control how their personal health data is used.
  • The Gramm-Leach-Bliley Act helps consumers of financial services keep their information private.
  • The Children’s Online Privacy Protection Act minimizes the collection and use of data from children under the age of 13.
  • The Consumer Privacy Bill of Rights is a privacy blueprint that helps people understand what personal data is being collected about them and that it is used in ways consistent with their expectations.
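
One way to make the de-identification recommendation above concrete is a k-anonymity check. This technique is not named in the report and is used here only as an illustration: a dataset is k-anonymous if every combination of quasi-identifier values (such as a generalized ZIP code and a birth year) is shared by at least k records. The sketch below uses made-up records and hypothetical field and function names.

```python
from collections import Counter


def smallest_group_size(records, quasi_identifiers):
    """Return the size of the smallest group of records that share
    the same values for all quasi-identifier fields (0 if no records)."""
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return min(groups.values(), default=0)


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every quasi-identifier combination appears at least k times."""
    return smallest_group_size(records, quasi_identifiers) >= k


# Made-up example records: names removed, ZIP codes already generalized.
records = [
    {"zip": "802**", "birth_year": 1980, "diagnosis": "flu"},
    {"zip": "802**", "birth_year": 1980, "diagnosis": "asthma"},
    {"zip": "803**", "birth_year": 1975, "diagnosis": "flu"},
]

# The third record is the only one with its (zip, birth_year) pair,
# so this dataset fails the check for k = 2.
print(is_k_anonymous(records, ["zip", "birth_year"], k=2))
```

A record that is unique on its quasi-identifiers is the kind most at risk of re-identification, which is why the check looks at the smallest group rather than the average.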

In the public sector, issues arise when the government collects information about its citizens for one purpose and later uses that same data for a different purpose (Podesta et al., 2014). This gives the government the potential to exert power over certain types of citizens and to hamper civil rights progress in the future. Current policies in the public sector are (Podesta et al., 2014):

  • The Affordable Care Act supports moving the health care system from a “fee-for-service” model to a “fee-for-better-outcomes” model. This has allowed big data analytics to be used to promote preventative care rather than emergency care, while limiting the use of that data to deny health care coverage for “pre-existing health conditions.”
  • The Family Educational Rights and Privacy Act, the Protection of Pupil Rights Amendment, and the Children’s Online Privacy Protection Act help seal children’s educational records to prevent misuse of that data.

Identifying opportunities for big data in the economy, health, education, safety, and energy efficiency

In the economy, the Internet of Things equips product parts with sensors that monitor and transmit thousands of live data points, which can be used to send alerts. These alerts can tell us when maintenance is needed, for which part, and where it is located, saving time across the entire process and improving overall safety (Podesta et al., 2014).
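
The alerting idea can be sketched as a simple threshold check over sensor readings. Everything specific here is a hypothetical assumption for illustration: the `Reading` fields, the part names, the locations, and the temperature limit do not come from the report.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One sensor data point (hypothetical schema)."""
    part: str
    location: str
    temperature_c: float


def maintenance_alerts(readings, max_temperature_c):
    """Return (part, location) pairs whose reading exceeds the limit."""
    return [
        (r.part, r.location)
        for r in readings
        if r.temperature_c > max_temperature_c
    ]


# Made-up readings; the 90.0 degree limit is an arbitrary example value.
readings = [
    Reading("pump-1", "line A", 71.5),
    Reading("pump-2", "line A", 96.0),
    Reading("valve-7", "line B", 64.2),
]

for part, location in maintenance_alerts(readings, max_temperature_c=90.0):
    print(f"Maintenance needed: {part} at {location}")
```

A real deployment would likely look at trends across many readings rather than a single fixed limit, but the output has the same shape: which part needs attention and where it is.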

In medicine, predictive analytics can identify instances of insurance fraud, waste, and abuse in real time, saving more than $115M per year (Podesta et al., 2014). Big data is also used in the study of neonatal intensive care, where current data helps produce prescriptive results about which newborns are likely to contract which infections and what the outcomes would be (Podesta et al., 2014). Monitoring a newborn’s heart rate and temperature, along with other health indicators, can alert doctors to the onset of an infection before it gets out of hand. Huge genetic datasets are helping locate the genetic variants behind certain genetic diseases that were once hidden in our genetic code (Podesta et al., 2014).

With regard to national safety and foreign interests, data scientists and data visualizers have been using data gathered by the military to help commanders solve real operational challenges on the battlefield (Podesta et al., 2014). Big data analytics applied to satellite data, surveillance data, and road traffic flow data is making it easier to detect, obtain, and properly dispose of improvised explosive devices (IEDs). The Department of Homeland Security aims to use big data analytics to identify threats as they enter the country, as well as people with a higher-than-normal probability of committing acts of violence within the country (Podesta et al., 2014). Another safety-related use of big data analytics is the identification of human trafficking networks by analyzing the “deep web” (Podesta et al., 2014).

Finally, for energy efficiency, understanding weather patterns and climate change can help us understand our contribution to climate change based on our use of energy and natural resources. By analyzing traffic data, we can improve energy efficiency and public safety in our current lighting infrastructure by dimming lights at appropriate times (Podesta et al., 2014). Companies can maximize energy efficiency by using big data analytics to control their direct and indirect energy use (through optimizing supply chains and monitoring equipment). Another move toward a more energy-efficient future is the government’s partnership with electric utility companies to give businesses and families access to their own energy usage data in an easy-to-digest form, so that people and companies can change their current consumption levels (Podesta et al., 2014).

Protecting your own privacy outside of policy recommendations

The report suggests that we can control our own privacy by using the private browsing function available in most current internet browsers, which helps prevent the collection of personal data (Podesta et al., 2014). However, private browsing varies from browser to browser. For important decisions such as being denied employment, credit, or insurance, consumers should be empowered to know why they were denied and should ask for that information (Podesta et al., 2014). Finding out the reason allows people to address those issues and persevere in the future. We can also encrypt our communications with the strongest encryption available to protect our privacy. We need to educate ourselves about how to protect our personal data, build digital literacy, and know how big data could be used and abused (Podesta et al., 2014). While we wait for current policies to catch up with the times, we actually have more power over our own data and privacy than we know.

 

Reference:

Podesta, J., Pritzker, P., Moniz, E. J., Holdren, J., & Zients, J. (2014). Big Data: Seizing Opportunities, Preserving Values. Executive Office of the President. Retrieved from https://www.whitehouse.gov/sites/default/files/docs/big_data_privacy_report_may_1_2014.pdf