@James-Rocker
Last active April 10, 2022 14:20
Data Analyst Career Paths
SQL Developer. Good money. Can work full time or as your own consultant business. Writes SQL code.
BI Developer. Better money. Can work full time or as your own consultant business. Writes whatever code is needed to produce the required report. Can earn more by also advising on which metrics are good or bad for a dashboard, but not on what the company should do with the numbers you present.
Report Writer. Lower pay, and usually only as an employee. You are given the data and the required output and you just make the dashboard.
Maybe an ETL / ELT Developer. Good to better money. Can work full time or as your own consultant business. Moves data from one system to another, as one-time moves, ongoing scheduled jobs, or in real time. Lots of SQL code, but also other tools to transform and move data.
Data Engineer. Top pay. Requires a good understanding of many database systems, data tools, and programming languages. You don't need to be fluent in any one of them, but you do need basic knowledge of how to use each and the pros and cons of picking one tool over another.
Data Architect. Top pay. Requires good knowledge of the internals of one/many database systems (depending on what they need) and how to properly lay out tables, fields, datatypes, and scale performance.
BI Analyst. Close to a Data Analyst. Tries to figure out how to save the company money, the effects of campaigns, insights on consumer desires, new markets to enter and so on.
Data Scientist. Top pay. Having the insight/intuition to design new systems to capture the data that the data analyst uses. Very high level data analysis.
@James-Rocker

James-Rocker commented Apr 9, 2022

It's kinda silly and meme-y but this article is a good intro to what you need to learn. https://reneelin2019.medium.com/data-analyst-roadmap-2022-ad95c6199b32

Ultimately it boils down to

  1. Collecting data - understanding where the data comes from, what creates it, and where to house it so you can work on it.
  2. Processing it - data can be in strange formats or contain errors, and you might need to split columns etc. The goal is to make it easy to work with later.
  3. Analysis/insights - this might be a simple question like how many units sold last month, or something more complex like an analysis of customers.
  4. Visualizing for stakeholders (usually whoever asked you for the insights) - display your results. The best way to do this is to tell a story with your data, because you are showing it to people who might only have a minute to understand the issue and make a decision.
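
As a rough illustration of those four steps, here is a minimal sketch using only the Python standard library. The CSV content, column names and function names are all made up for the example, and the "visualization" is just a text bar chart standing in for a real charting tool.

```python
import csv
import io
from collections import defaultdict

# 1. Collect: a made-up CSV export standing in for a real report.
RAW_CSV = """customer,order_date,units
Alice Smith,2022-03-02,3
Bob Jones,2022-03-15, 5
Alice Smith,2022-04-01,2
Cara Lee,2022-03-20,not recorded
"""


def load_rows(text):
    """Read CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_rows(rows):
    """2. Process: split columns, fix types, drop rows that cannot be parsed."""
    cleaned = []
    for row in rows:
        try:
            units = int(row["units"].strip())
        except ValueError:
            continue  # skip rows with an unusable units value
        first_name, last_name = row["customer"].split(" ", 1)
        cleaned.append({
            "first_name": first_name,
            "last_name": last_name,
            "month": row["order_date"][:7],  # keep only "YYYY-MM"
            "units": units,
        })
    return cleaned


def units_by_month(rows):
    """3. Analyse: total units sold per month."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["month"]] += row["units"]
    return dict(totals)


def text_bar_chart(totals):
    """4. Visualise: one line per month, one '#' per unit sold."""
    return "\n".join(
        f"{month} {'#' * n} ({n})" for month, n in sorted(totals.items())
    )


rows = clean_rows(load_rows(RAW_CSV))
totals = units_by_month(rows)
print(text_bar_chart(totals))
```

In real work each step would usually be a different tool (a database for collecting, pandas for processing, a BI tool for the chart), but the shape of the job stays the same.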

@James-Rocker

James-Rocker commented Apr 9, 2022

Good books

Collecting data

As you will probably just need some SQL, try https://www.amazon.co.uk/dp/1492057614

For more general data mining

https://www.amazon.co.uk/Data-Mining-Concepts-Techniques-Management/dp/0123814790

However, as an analyst it's unlikely you will need to know much about collection. Most of the time, as long as you can extract data from an Excel file, a database or a report and put it into something you can process, you shouldn't need to know a lot more. The most you should need is a bit of SQL. Another SQL book with more examples:
https://www.amazon.co.uk/SQL-Data-Analytics-efficient-analysis/dp/1789807352

Processing data

https://www.amazon.co.uk/Python-Data-Analysis-Wes-Mckinney/dp/1491957662

Analysis

https://www.amazon.co.uk/Python-Data-Analysis-Wes-Mckinney/dp/1491957662 - covers both processing and analysis
http://www.amazon.co.uk/Python-Everybody-Exploring-Data/dp/1530051126

Visualization

https://www.amazon.co.uk/Storytelling-Data-Visualization-Business-Professionals/dp/1119002257

Python does have some visualization tools but IMO, it's better to use a real visualization tool for presenting to clients. Knowing different analytics tools is useful, but the most important thing here is making sure your visuals clearly communicate what your analysis found. This is way harder than learning any particular tool and will always apply.

@James-Rocker

Local database learning

Learning SQL can be a pain because running it locally can be kind of annoying. For example, you may have to install database server software, or the database may only be usable from the command line. IMO, learn the syntax first through something like an online course, then try installing a database locally to play around with data.

If you want something to run locally, PostgreSQL is free and comes packaged with a good editor for reading data easily (plus it connects well with Python) https://www.postgresql.org/download/windows/.

From here, try loading a data set and start querying it. Google has a great data set search tool for anything you are interested in querying https://datasetsearch.research.google.com/
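
If you want to try the load-then-query loop before installing anything, here is a minimal sketch. Note the swap: it uses SQLite through Python's built-in `sqlite3` module instead of PostgreSQL, purely so it runs with no server. The `sales` table and its rows are invented for the example, but the SQL itself is plain enough to work much the same on PostgreSQL.

```python
import sqlite3

# An in-memory SQLite database: nothing to install, gone when the script ends.
conn = sqlite3.connect(":memory:")

# Load a tiny made-up data set.
conn.execute("CREATE TABLE sales (product TEXT, region TEXT, units INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("widget", "north", 10),
        ("widget", "south", 7),
        ("gadget", "north", 4),
        ("gadget", "south", 9),
        ("gizmo", "north", 1),
    ],
)

# Query it: total units per product, biggest seller first.
query = """
SELECT product, SUM(units) AS total_units
FROM sales
GROUP BY product
ORDER BY total_units DESC, product
"""
results = conn.execute(query).fetchall()

for product, total_units in results:
    print(product, total_units)

conn.close()
```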

@James-Rocker

If you want to understand databases better, there's a really good blog post https://www.holistics.io/blog/how-to-read-data-warehouse-toolkit/ which is worth reading because most database books are really dry and pointlessly boring. This will be useful for data collection, so you know how to tie data together.

@James-Rocker

James-Rocker commented Apr 9, 2022

For learning online, there are loads of online editors which are great starting points (but they do shield you from some of the real-world difficulties with the tech and languages). I'd recommend the following:

SQL - https://sqlzoo.net/wiki/SQL_Tutorial
Python fundamentals - https://www.codecademy.com/learn/learn-python-3
Data analysis - https://www.datacamp.com/ (not free but it is very good)

@James-Rocker

Excel is useful for a lot of this. It is a great intro to analytics and it's always available (which has been a problem in the past for me).
However, it has a whole bunch of problems, including: it is slow, different people end up with different versions of the same file, it has a limited number of rows, it mangles data formatting, and more. If you still want to learn more:

https://www.excel-easy.com/data-analysis.html

@James-Rocker

Get a copy of PyCharm https://www.jetbrains.com/pycharm/download/#section=windows

Download a copy of Python 3 - https://www.python.org/downloads/

You might not have to install Python separately if you use PyCharm, but it's worth doing anyway.

For in-browser Python coding/open analysis, try Jupyter notebooks https://jupyter.org/

@James-Rocker

Useful libraries

Pandas - processing data
SQLAlchemy - collecting from SQL
seaborn - data visualization https://seaborn.pydata.org/
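
To give a feel for how these fit together, here is a minimal sketch of pulling rows out of SQL into pandas and tidying them up. It assumes pandas is installed. To stay runnable without a database server it passes a built-in `sqlite3` connection to `pd.read_sql_query` rather than a SQLAlchemy engine, which is what you would more likely use against a real database. The `orders` table and its values are made up.

```python
import sqlite3

import pandas as pd

# Made-up source data with messy customer names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(" alice ", 20.0), ("Bob", 15.5), ("ALICE", 5.0), ("bob", 4.5)],
)
conn.commit()

# Collect: read the query result straight into a DataFrame.
df = pd.read_sql_query("SELECT customer, amount FROM orders", conn)
conn.close()

# Process: normalise whitespace and capitalisation so names match up.
df["customer"] = df["customer"].str.strip().str.title()

# Analyse: total spend per customer, highest first.
summary = (
    df.groupby("customer", as_index=False)["amount"]
    .sum()
    .sort_values("amount", ascending=False)
)
print(summary)
```

From here, something like `seaborn.barplot(data=summary, x="customer", y="amount")` could turn the summary into a quick chart.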
