{"id":329340,"date":"2023-08-25T13:08:23","date_gmt":"2023-08-25T13:08:23","guid":{"rendered":"http:\/\/itteacheritfreelance.hk\/wordpress\/?guid=94bcf980aecb36825a45d4482293d204"},"modified":"2023-08-25T13:08:23","modified_gmt":"2023-08-25T13:08:23","slug":"getting-started-in-data-science-europython-2023-follow-up","status":"publish","type":"post","link":"https:\/\/itteacheritfreelance.hk\/wordpress\/index.php\/2023\/08\/25\/getting-started-in-data-science-europython-2023-follow-up\/","title":{"rendered":"Getting Started in Data Science: EuroPython 2023 Follow-Up"},"content":{"rendered":"<p>One of my favorite parts of my job as a developer advocate is being able to help people get started in data science. 
I still remember when I made the transition from academia to data science almost 8 years ago, and how overwhelming it was and how much I felt like I needed to learn to even get started. I am also truly passionate about this wonderful field, and I love to help others get started in an area that is so interesting and rewarding.<\/p>\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" loading=\"lazy\" width=\"2560\" height=\"1440\" src=\"https:\/\/blog.jetbrains.com\/wp-content\/uploads\/2023\/08\/Preview-page-1280x720-2x-2.png\" alt=\"\" class=\"wp-image-383420\"\/><\/figure>\n<p>I was lucky enough to be involved in a couple of activities geared toward helping data science beginners at EuroPython this year, including the <a href=\"https:\/\/ep2023.europython.eu\/session\/humble-data\"  rel=\"noopener\">Humble Data workshop<\/a> and a <a href=\"https:\/\/ep2023.europython.eu\/session\/qa-panel-for-data-science-newbies\"  rel=\"noopener\">Q&amp;A session for data science newbies<\/a> along with <a href=\"https:\/\/ep2023.europython.eu\/speaker\/cheuk-ting-ho\"  rel=\"noopener\">Cheuk Ting Ho<\/a>, <a href=\"https:\/\/ep2023.europython.eu\/speaker\/valerio-maggio\"  rel=\"noopener\">Valerio Maggio<\/a>, and <a href=\"https:\/\/ep2023.europython.eu\/speaker\/vaibhav-srivastav\"  rel=\"noopener\">Vaibhav (VB) Srivastav<\/a>. 
After both of these sessions I had a lot of great conversations with people who asked about which resources helped me when I was starting, and I wanted to share the content of these conversations a bit more widely.<\/p>\n<p>Let\u2019s first recap what we covered in the Q&amp;A session, and then dive into some further resources to get you started on your data science journey.<\/p>\n<p><iframe loading=\"lazy\" width=\"560\" height=\"315\" src=\"https:\/\/www.youtube.com\/embed\/5JuNAvheGvU?si=oi-KLPSoS7Pk4ETb&amp;start=21350\" title=\"YouTube video player\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" allowfullscreen><\/iframe><\/p>\n<h2 class=\"wp-block-heading\">What we covered in the Q&amp;A session<\/h2>\n<h3 class=\"wp-block-heading\">How do you define what a data scientist is in 2023?<\/h3>\n<p>Just like when I started in 2016, data science is defined differently depending on who you talk to. However, the field has definitely gotten more complicated as it has matured, with additional roles like machine learning and MLOps engineers becoming established in the last few years.<\/p>\n<p>Despite all of the continued confusion, the core of the role remains working with data to tell a story <em>scientifically<\/em> (after all, it\u2019s in the name!). This involves applying techniques like data preparation and analysis, statistics, and visualization to answer a question that is typically somewhat complex. While machine learning has become synonymous with data science, it\u2019s not actually a core part of data science work. 
Some data science projects may involve machine learning, but certainly not all of them.<\/p>\n<h3 class=\"wp-block-heading\">What skills do data scientists tend to have?<\/h3>\n<p>There is a well-known <a href=\"https:\/\/subscription.packtpub.com\/book\/data\/9781785887918\/1\/ch01lvl1sec09\/the-data-science-venn-diagram\"  rel=\"noopener\">Venn diagram<\/a> that has been circulating since before I even started in data science. It depicts the field as a convergence of mathematical skills, engineering skills, and domain knowledge. When I first started out, this diagram really overwhelmed me; I felt like I needed to master all three of these to even get started!<\/p>\n<p>In reality, it is impossible to know every skill used in data science in depth. Some people will come in with stronger mathematical or scientific skills, others will come from a software engineering background, and they\u2019ll all pick up the remaining skills on the job. The split between data science roles also means you can play to your strengths and interests better. Those who have more experience with analysis or statistics may go for a more traditional data scientist role, while those with stronger engineering skills may gravitate toward machine learning engineering.<\/p>\n<p>Finally, unless you work in a tiny startup, it\u2019s unlikely you will be working alone. Data scientists tend to do the research and prototyping side of things, while engineers put the models into production. So don\u2019t worry if you\u2019re not an expert at everything \u2013 there\u2019s a place for your skills in this field!<\/p>\n<h3 class=\"wp-block-heading\">How can I start developing my skills?<\/h3>\n<p>One of the most common misconceptions about data science is that you need a PhD or some other advanced degree. 
However, this is just one possible path for developing the core skill set of data scientists we talked about above.<\/p>\n<p>The best way to develop these skills is just to get hold of datasets that interest you and start creating projects with them. VB in particular found the subreddit <a href=\"https:\/\/www.reddit.com\/r\/dataisbeautiful\/\"  rel=\"noopener\">r\/dataisbeautiful<\/a> helpful for getting motivation and feedback. I love writing, so I started a blog. Cheuk recommends volunteering for organizations like <a href=\"https:\/\/www.datakind.org\/\"  rel=\"noopener\">DataKind<\/a> and having a community around you. Once you have a feel for working with real data, you will have mastered one of the most important skills, and you can build the rest on top of it.<\/p>\n<p>Finally, the main thing is not to panic! Just choose the tooling (language, development environment, and packages) that you like best in the beginning, and build up your skills using these. I personally loved R when I started because it was designed for people from statistics backgrounds and suited me better, but over time I switched to Python as I moved more into machine learning.<\/p>\n<h2 class=\"wp-block-heading\">Useful resources<\/h2>\n<p>To help you continue your data science journey, I\u2019m also including a list of resources I\u2019ve found useful in the past (or content I\u2019ve created to cover specific topics).<\/p>\n<h3 class=\"wp-block-heading\">Programming languages<\/h3>\n<p>Your first step will be getting some basic programming under your belt \u2013 and by basic, I really do mean basic! I\u2019d recommend starting with either R or Python. 
There are dozens of courses for each online, but I can recommend the two that I used: <a href=\"https:\/\/psyr.djnavarro.net\/\"  rel=\"noopener\">R for Psychological Science<\/a> and <a href=\"https:\/\/learnpythonthehardway.org\/\"  rel=\"noopener\">Learn Python the Hard Way<\/a>.<\/p>\n<p>You should also try to include SQL in your coding toolbelt. I\u2019ve found that <a href=\"https:\/\/www.w3schools.com\/sql\/\"  rel=\"noopener\">W3Schools\u2019 SQL course<\/a> is a great place to get started.<\/p>\n<h3 class=\"wp-block-heading\">Data analysis<\/h3>\n<p>Learning pandas is fundamental to getting started with data analysis in Python, and I cannot recommend Wes McKinney\u2019s book <a href=\"https:\/\/wesmckinney.com\/book\/\"  rel=\"noopener\"><em>Python for Data Analysis<\/em><\/a> highly enough. Once you\u2019ve finished with that book, you probably want to start playing with some real data. For this, I recommend two sources: the <a href=\"https:\/\/archive.ics.uci.edu\/datasets\"  rel=\"noopener\">UC Irvine Machine Learning Repository<\/a> and <a href=\"https:\/\/www.kaggle.com\/datasets\"  rel=\"noopener\">Kaggle Datasets<\/a>.<\/p>\n<p>From there, you will probably want to get into data visualization. For R, the gold standard for graphing is <a href=\"https:\/\/ggplot2.tidyverse.org\/\"  rel=\"noopener\">ggplot2<\/a>, but there is more diversity in Python plotting packages, which include <a href=\"https:\/\/matplotlib.org\/\"  rel=\"noopener\">Matplotlib<\/a>, <a href=\"https:\/\/seaborn.pydata.org\/\"  rel=\"noopener\">seaborn<\/a>, <a href=\"https:\/\/plotly.com\/\"  rel=\"noopener\">plotly<\/a>, <a href=\"https:\/\/lets-plot.org\/\"  rel=\"noopener\">lets-plot<\/a>, <a href=\"https:\/\/plotnine.readthedocs.io\/en\/stable\/\"  rel=\"noopener\">plotnine<\/a>, and more. 
I think the best way to get started with plotting is just to think about what you want to show (maybe check out <a href=\"https:\/\/www.reddit.com\/r\/dataisbeautiful\/\"  rel=\"noopener\">r\/dataisbeautiful<\/a> for inspiration) and start messing around with a plotting package that you like.<\/p>\n<p>Once you are ready to tackle data cleaning and data quality issues, you may want to pick up another book or course on the topic. I have a <a href=\"https:\/\/www.youtube.com\/watch?v=9EI_lqPUVEE&amp;ab_channel=NDCConferences\"  rel=\"noopener\">talk<\/a> where I give an overview of some of the major issues that can come up in datasets and negatively affect your data science work. Much of this talk\u2019s content comes from one of my university statistics books, <a href=\"https:\/\/www.pearson.com\/en-us\/subject-catalog\/p\/using-multivariate-statistics\/P200000003097\/9780137526543\"  rel=\"noopener\"><em>Using Multivariate Statistics<\/em><\/a>.<\/p>\n<h3 class=\"wp-block-heading\">Statistics and machine learning<\/h3>\n<p>Once you\u2019re ready to dive into more advanced topics, you can start covering statistics and machine learning. I think these are both topics you can cover bit by bit (as they can be quite dense), so don\u2019t feel like you need to master everything before you can start working as a data scientist.<\/p>\n<p>While I learned statistics from my university textbooks (which are probably a bit too specific to psychology to recommend widely), I have heard nothing but good things about <a href=\"https:\/\/greenteapress.com\/thinkstats\/\"  rel=\"noopener\"><em>Think Stats<\/em><\/a>. In terms of machine learning, there are a few options. 
I personally loved <a href=\"https:\/\/www.deeplearning.ai\/courses\/machine-learning-specialization\/\"  rel=\"noopener\">Andrew Ng\u2019s <em>Machine Learning Specialization<\/em><\/a> for machine learning and <a href=\"https:\/\/www.manning.com\/books\/deep-learning-with-python\"  rel=\"noopener\">Fran\u00e7ois Chollet\u2019s <em>Deep Learning with Python<\/em><\/a> for an introduction to deep learning. I\u2019ve also had friends who really liked both the classic <a href=\"https:\/\/www.statlearning.com\/\"  rel=\"noopener\"><em>An Introduction to Statistical Learning<\/em><\/a> and <a href=\"https:\/\/developers.google.com\/machine-learning\/crash-course\"  rel=\"noopener\">Google\u2019s Machine Learning Crash Course<\/a>.<\/p>\n<h3 class=\"wp-block-heading\">Shout out to Humble Data!<\/h3>\n<p>And as a final plug \u2013 if you\u2019re looking for a way to get started but want some more support, you can also keep an eye out for the next <a href=\"https:\/\/humbledata.org\/\"  rel=\"noopener\">Humble Data<\/a> workshop! 
This free workshop is aimed at getting you up and running with basic Python data science, going from the basics of Python programming to working with pandas and data visualization.<\/p>\n","protected":false},"excerpt":{"rendered":"<div class=\"mh-excerpt\"><p>One of my favorite parts of my job as a developer advocate is being able to help people get started in data science. 
I still remember when I made the transition from academia to data science almost 8 years ago, and how overwhelming it was and how much I felt like I needed to learn [\u2026]<\/p>\n<\/div>","protected":false},"author":2035,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"slim_seo":{"title":"Getting Started in Data Science: EuroPython 2023 Follow-Up - ITTeacherITFreelance.hk","description":"One of my favorite parts of my job as a developer advocate is being able to help people get started in data science. I still remember when I made the transition"},"footnotes":""},"categories":[10700],"tags":[],"_links":{"self":[{"href":"https:\/\/itteacheritfreelance.hk\/wordpress\/index.php\/wp-json\/wp\/v2\/posts\/329340"}],"collection":[{"href":"https:\/\/itteacheritfreelance.hk\/wordpress\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/itteacheritfreelance.hk\/wordpress\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/itteacheritfreelance.hk\/wordpress\/index.php\/wp-json\/wp\/v2\/users\/2035"}],"replies":[{"embeddable":true,"href":"https:\/\/itteacheritfreelance.hk\/wordpress\/index.php\/wp-json\/wp\/v2\/comments?post=329340"}],"version-history":[{"count":1,"href":"https:\/\/itteacheritfreelance.hk\/wordpress\/index.php\/wp-json\/wp\/v2\/posts\/329340\/revisions"}],"predecessor-version":[{"id":329341,"href":"https:\/\/itteacheritfreelance.hk\/wordpress\/index.php\/wp-json\/wp\/v2\/posts\/329340\/revisions\/329341"}],"wp:attachment":[{"href":"https:\/\/itteacheritfreelance.hk\/wordpress\/index.php\/wp-json\/wp\/v2\/media?parent=329340"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/itteacheritfreelance.hk\/wordpress\/index.php\/wp-json\/wp\/v2\/categories?post=329340"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/itteacheritfreelance.hk\/wordpress\/index.php\/wp-json\/wp\/v2\/tags?post=329340"}],"curies":[{
"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}