Tuesday, September 19, 2017

The Future of Big Data: 10 Predictions You Should Be Aware Of

By 2020, every person in the world will be creating 7 MB of data every second. We have already created more data in the past couple of years than in the entire history of humankind. Big data has taken the world by storm, and it shows no signs of slowing down. You might be wondering, “Where will the big data industry go from here?” Here are 10 big data predictions that answer that question.

1. Machine Learning Will Be the Next Big Thing in Big Data

Machine learning is one of the hottest technology trends today, and it will play a big part in the future of big data as well. According to Ovum, machine learning will be at the forefront of the big data revolution. It will help businesses prepare data and conduct predictive analysis so that they can overcome future challenges more easily.

2. Privacy Will Be the Biggest Challenge

Whether it is the Internet of Things or big data, the biggest challenge for emerging technologies has been the security and privacy of data. The volume of data we are creating now, and the volume that will be created in the future, will make privacy even more important because the stakes will be much higher. According to Gartner, more than 50% of business ethics violations by 2018 will be data related. Data security and privacy concerns will be the biggest hurdle for the big data industry, and if it fails to cope with them effectively, big data could join the long list of technology trends that faded very quickly.

3. Chief Data Officer: A New Position Will Emerge

You might be familiar with the Chief Executive Officer (CEO), Chief Marketing Officer (CMO), and Chief Information Officer (CIO), but have you ever heard of the Chief Data Officer (CDO)? If your answer is no, do not worry, because you soon will. According to Forrester, the chief data officer will emerge as a new position, and businesses will begin appointing CDOs. Although the appointment depends on the type of business and its data needs, the wider adoption of big data technologies across enterprises means that hiring a chief data officer will become the norm.

4. Data Scientists Will Be In High Demand

If you are still not sure which career path to choose, there is no better time to start a career in data science. As the volume of data grows, demand for data scientists, analysts, and data management experts will shoot up. The gap between the demand for data professionals and their availability will widen, which will help data scientists and analysts command higher salaries. What are you waiting for? Dive into the world of data science and build a brighter future.

5. Businesses Will Buy Algorithms, Instead of Software

We will see a fundamental shift in how businesses approach software. More and more businesses will look to purchase algorithms instead of building their own. After buying an algorithm, a business can add its own data to it. This gives businesses more customization options than buying software does. You cannot tweak packaged software to fit your needs; in fact, it is the other way around, and your business has to adjust to the software’s processes. All of this will end soon, with algorithm-selling services taking center stage.

6. Investments in Big Data Technologies Will Skyrocket

According to IDC analysts, “Total revenues from big data and business analytics will rise from $122 billion in 2015 to $187 billion in 2019.” Business spending on big data will surpass $57 billion this year. Although business investment in big data may vary from industry to industry, the increase in big data spending will remain consistent overall. The manufacturing industry will spend the most on big data technology, while health care, banking, and resource industries will be the fastest to adopt it.

7. More Developers Will Join the Big Data Revolution

According to statistics, six million developers are currently working with big data and using advanced analytics. That is more than 33% of the developers in the world. What’s even more amazing is that big data is just getting started, so we will see a surge in the number of developers building applications for big data in the years to come. With the financial rewards of higher salaries involved, developers will be eager to create applications that work with big data.

8. Prescriptive Analytics Will Become an Integral Part of BI Software

Gone are the days when businesses had to purchase dedicated software for everything. Today, businesses demand a single piece of software that provides all the features they need, and software companies are giving them that. Business intelligence software is following the same trend, and we will see prescriptive analytics capabilities added to it in the future.
IDC predicts that half of business analytics software will incorporate prescriptive analytics built on cognitive computing functionality. This will help businesses make intelligent decisions at the right time. With intelligence built into the software, you can sift through large amounts of data quickly and gain an advantage over your competitors.

9. Big Data Will Help You Break Productivity Records

No other investment will deliver a higher return than big data, especially when it comes to boosting business productivity. To put numbers on it: according to IDC, organizations that invest in this technology and gain the ability to analyze large amounts of data quickly and extract actionable information can achieve an extra $430 billion in productivity benefits over their competitors. Yes, you read that right: $430 billion. Remember, actionable is the key word here. You need actionable information to take your productivity to new heights.

10. Big Data Will Be Replaced By Fast and Actionable Data

According to some big data experts, big data is dead. They argue that businesses do not use even a small portion of the data they have access to, and that bigger does not always mean better. Sooner rather than later, big data will be replaced by fast and actionable data, which will help businesses make the right decisions at the right time. Having tremendous amounts of data will not give you a competitive advantage; how effectively and quickly you analyze the data and extract actionable information from it will.

Wednesday, September 6, 2017

Google Analytics User Guide

Google Analytics has long played an important role in managing website activity and in helping administrators make their sites more effective. Google’s Analytics tool is also one of the best SEO support tools, letting us measure and improve SEO performance.
It is time to make this tool an indispensable part of every SEO campaign. To help readers use it effectively, vietmoz.net presents an overview guide to using Google Analytics, from basic to advanced, by Quang Huy and Bá Cường:
(Download the guide here)

Here are the main topics covered in the guide:
  1. Overview of Google Analytics (Part 1: Core value)
  2. Using the Google Analytics interface (Part 2: The basic Google Analytics interface)
  3. Some advanced Google Analytics features (Part 3: Advanced Google Analytics)

1. Overview of Google Analytics

Google Analytics is a free website data analysis tool provided by Google. The data centers on user behavior before, during, and after a visit to the site.
From this data, Google Analytics gives you an overview of:
  • How users reach the site
  • The level and type of user interaction on the site
  • How users feel about the content on the site
  • Why users do not make a purchase on the site
For marketers, answering these questions means understanding why the conversion rate is low and how to raise it.
The data Google Analytics collects comes from the tracking code installed on the site (the code you receive when you set up Google Analytics).
As a result, pages without the Analytics code cannot send data to Google’s servers, where the data is stored and processed.

2. Using the Google Analytics interface

To use Google Analytics proficiently, you need to master the following basic operations:
  • Setting and adjusting date ranges
  • Reading and customizing Google Analytics data tables, filtering data, and changing how reports are displayed (tables, bar charts, pie charts, and so on)
  • Analyzing the flow of traffic to the website

3. Some advanced Google Analytics features

  • Goals:
    To help administrators set up and track conversions on the site, Google Analytics provides the Goals tool for defining goals on the site, from which administrators can count conversions. (A conversion is recorded when a user completes a goal.)
  • Funnels:
    Funnels model and manage the cycle of user behavior on the website, mirroring the marketing funnel. They also provide tools for managing user behavior throughout the funnel, from the moment someone is only a visitor until they become a customer of the website.
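
To make the relationship between goals, funnels, and conversion rate concrete, here is a minimal Python sketch. The step names and visitor counts are invented for illustration and do not come from Google Analytics, and the function names are assumptions chosen for this example.

```python
# Hypothetical funnel: step names and visitor counts are made up for illustration.
funnel = [
    ("Landing page", 1000),
    ("Product page", 400),
    ("Cart", 120),
    ("Purchase (goal completed)", 30),
]


def step_rates(steps):
    """For each step after the first, return the share of the previous step that continued."""
    rates = []
    for (_, prev_count), (name, count) in zip(steps, steps[1:]):
        rates.append((name, count / prev_count))
    return rates


def conversion_rate(steps):
    """Overall conversion rate: goal completions divided by visitors entering the funnel."""
    return steps[-1][1] / steps[0][1]


for name, rate in step_rates(funnel):
    print(f"{name}: {rate:.1%} of the previous step")
print(f"Overall conversion rate: {conversion_rate(funnel):.1%}")
```

Reading the per-step rates side by side shows where visitors drop out of the funnel, which is the question a low conversion rate raises.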

Friday, September 1, 2017

10 Best Ways to Learn Analytics Online

You know business analytics are crucial to the success of your organization. You know you need to improve your working BI knowledge.
But you don’t know where to start.
Don’t panic.
Here are our top ten (totally objective) picks for great online courses and resources to help you master the fundamentals of business analytics and take advantage of the tremendous opportunities better business intelligence can offer.
After all, access to easy learning is a key perk of the info revolution.

1. edX Data Analysis and Statistics Courses

edX brings you courses from leading universities all over the world, including Harvard, MIT, UC Berkeley, and more. Browse through until you find the course that best matches your needs. We recommend taking a close look at Statistical Thinking for Data Science and Analytics, taught by Andrew Gelman of the Statistical Modeling, Causal Inference, and Social Science blog. It’s an excellent choice if you want to learn all about the role statistics plays in data science and analytics.
  • Duration: Self-paced
  • Fee: FREE
  • Certificate: Yes; university credit also offered

2. National Tsing Hua University’s Business Analytics Using Forecasting via FutureLearn

This course is perfect for business users who want predictive analytics. If you frequently generate or read forecasts, this is the course for you. In this course, top professors teach you key predictive analytics strategies such as how to evaluate the performance of forecasting methods and how to tie forecasting analytics in with the business challenges.
  • Duration: 6 weeks, 3 hours per week
  • Fee: FREE
  • Certificate: Yes

3. Codecademy’s Learn SQL

Codecademy is an absolute must for anyone with dreams of becoming a bona fide data analyst, and a major leg up for non-tech users who want to do more with data. SQL is the structured query language used by the vast majority of databases, CRMs, and business apps. Learn SQL and you’ll know how to access and read data in almost any context. This course is incredibly practical, prompting you to run your SQL commands via an interactive interface.
  • Duration: Self-paced
  • Fee: FREE
  • Certificate: No

4. Big Data University’s Analytics, Big Data, and Data Science Courses

Tied in with UN Global Goals, Big Data University offers a ton of great data analytics learning content with an ethical edge. Choose from courses covering Big Data Fundamentals or go deeper with courses on Hadoop Programming and more. Big Data U is the best option for “unofficial” data scientists who are interested in making a career and a difference.
  • Duration: Self-paced
  • Fee: FREE
  • Certificate: No

5. Occam’s Razor Blog, Podcast, and Videos

Not ready to commit to a full course? The blog of author and data analytics pro Avinash Kaushik brings you timely expert insights on what’s happening in data analytics, all within a business context. With insightful use cases and an eye to the future, it offers something new even for experienced analysts.
  • Duration: Self-paced
  • Fee: FREE
  • Certificate: No

6. Data Analysis Training and Tutorials from Lynda.com

As LinkedIn’s online learning platform, Lynda.com’s educational content is generated by practitioners, for practitioners. At the time of this post, there are 69 courses and 2,594 tutorials for data analysis training, covering a broad range of topics, including web analytics, data validation, how to use tools like Excel and SPSS Statistics, and much more. From data scientists to non-tech users, Lynda.com has something for everyone.
  • Duration: Self-paced
  • Fee: Free 10-day trial, $25 per month for basic membership, $37.50 for premium, flexible pricing options for teams of five or more
  • Certificate: For select courses only

7. Wharton’s Business Analytics Specialization via Coursera

Learn and apply business data strategies in four targeted courses: Customer Analytics, Operations Analytics, People Analytics, and Accounting Analytics. The specialization culminates with a fifth course, the Business Analytics Capstone, a hands-on project that lets you apply your new data analysis skills to real problems faced by tech giants Yahoo, Google, and Facebook. The Capstone Project was designed in partnership with Yahoo with the goal of enabling the learner to make data-driven decisions in his or her organization.
  • Duration: 4 weeks per course, 1-6 hours per week, approximately 5-6 months total
  • Fee: 7-day free trial, then $49 per month
  • Certificate: Yes

8. Jigsaw Academy, The Online School of Analytics

Jigsaw Academy offers a well-curated collection of courses for new and seasoned data scientists, as well as non-tech beginners. You can choose a la carte classes to brush up on a particular topic, or opt for one of their extensive course packages which offer everything you need to know within a key area of analytics, such as Data Scientist, Big Data Analyst, or Machine Learning Specialist. Each course comes with a certificate upon completion.
  • Duration: 1 to 27 weeks depending on course
  • Fee: Courses start at $75
  • Certificate: Yes

9. The University of British Columbia’s Course on Creating and Managing the Analytical Business Culture

Our favorite thing about this course is that it deals with some of the trickiest real-world business intelligence problems, such as getting organizational buy-in and how to train and communicate with staff to create a business culture that embraces data-driven decision-making.
  • Duration: 4 weeks
  • Fee: $725-$740 CAD, special rate of $640 CAD available to members of the Digital Analytics Association
  • Certificate: This course can be applied to the UBC/DAA Award of Achievement in Digital Analytics or the Certificate in Web Intelligence

10. TM Forum’s Big Data Analytics Project

With over 85,000 members from 900+ global enterprises, TM Forum is one of the leading industry associations for digital business across industries. Their Big Data Analytics program covers best practices for extracting value from your business analytics and includes use cases to help you decide how to work through your business analytics challenges.
  • Duration: Self-paced
  • Fee: Corporate membership fees vary according to revenues but start at $1,700
  • Certificate: Yes

Now that you have ten great options to choose from, it’s time to get started! Whether you want to improve your existing skills or get a jump start on your learning, knowledge of BI will quickly start transforming your professional career and your organization, no matter the industry. Take advantage of the info revolution and stay the course.

The Fundamentals of Data Science

Two of the biggest buzzwords in our industry are “big data” and “data science”. Big Data seems to have a lot of interest right now, but Data Science is fast becoming a very hot topic.
I think there’s room to really define the science of data science – what are those fundamentals that are needed to make data science truly a science we can build upon?
What follows is one such set of fundamentals:
Fundamentals of Data Science
The easiest thing for people within the big data / analytics / data science disciplines is to say “I do data science”. However, when it comes to data science fundamentals, we need to ask the following critical questions: What really is “data”, what are we trying to do with data, and how do we apply scientific principles to achieve our goals with data?
– What is Data?
– The Goal of Data Science
– The Scientific Method
Probability and Statistics
The world is a probabilistic one, so we work with data that is probabilistic – meaning that, given a certain set of preconditions, data will appear to you in a specific way only part of the time.  To apply data science properly, one must become familiar and comfortable with probability and statistics.
– The Two Characteristics of Data
– Examples of Statistical Data
– Introduction to Probability
– Probability Distributions
– Connection with Statistical Distributions
– Statistical Properties (Mean, Mode, Median, Moments, Standard Deviation, etc.)
– Common Probability Distributions (Discrete, Binomial, Normal)
– Other Probability Distributions (Chi-Square, Poisson)
– Joint and Conditional Probabilities
– Bayes’ Rule
– Bayesian Inference
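
As a small illustration of Bayes’ rule from the list above, the sketch below computes a posterior probability for a hypothetical diagnostic test. The prior, the detection rate, and the false-positive rate are made-up numbers, and the function name is an assumption chosen for this example.

```python
def posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Bayes' rule: P(H | E) = P(E | H) * P(H) / P(E)."""
    # Total probability of the evidence, summed over H and not-H.
    p_evidence = (p_evidence_given_h * prior
                  + p_evidence_given_not_h * (1 - prior))
    return p_evidence_given_h * prior / p_evidence


# Made-up numbers: 1% of people have the condition, the test detects it
# 95% of the time, and it gives a false positive 5% of the time.
p = posterior(prior=0.01, p_evidence_given_h=0.95, p_evidence_given_not_h=0.05)
print(f"P(condition | positive test) = {p:.3f}")
```

Even with a fairly accurate test, the posterior stays well below one half here because the prior is so small, which is the kind of result that makes conditional probability worth studying carefully.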
Decision Theory
This section is one of the key fundamentals of data science. Whether applied in scientific, engineering, or business fields, we are trying to make decisions using data. Data itself isn’t useful unless it’s telling us something, which means we’re making a decision about what it is telling us. How do we come up with those decisions? What are the factors that go into this decision making process? What is the best method for making decisions with data? This section tells us…
– Hypothesis Testing
– Binary Hypothesis Test
– Likelihood Ratio and Log Likelihood Ratio
– Bayes Risk
– Neyman-Pearson Criterion
– Receiver Operating Characteristic (ROC) Curve
– M-ary Hypothesis Test
– Optimal Decision Making
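
To show how a binary hypothesis test and the log likelihood ratio fit together, here is a minimal sketch. It assumes two Gaussian hypotheses with a known, shared standard deviation (H0: mean 0, H1: mean 1); those values, the sample data, and the function names are assumptions for illustration. A threshold of 0 corresponds to equal priors and equal error costs.

```python
import math


def gaussian_log_pdf(x, mean, std):
    """Log of the Gaussian probability density at x."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def log_likelihood_ratio(samples, mean0, mean1, std):
    """Sum over independent samples of log p(x | H1) - log p(x | H0)."""
    return sum(
        gaussian_log_pdf(x, mean1, std) - gaussian_log_pdf(x, mean0, std)
        for x in samples
    )


def decide(samples, mean0=0.0, mean1=1.0, std=1.0, threshold=0.0):
    """Return 1 to choose H1 when the log likelihood ratio exceeds the threshold, else 0."""
    llr = log_likelihood_ratio(samples, mean0, mean1, std)
    return 1 if llr > threshold else 0


print(decide([0.9, 1.2, 0.8]))   # samples clustered near the H1 mean
print(decide([-0.1, 0.2, 0.1]))  # samples clustered near the H0 mean
```

Raising or lowering the threshold trades false alarms against missed detections, which is the trade-off that the Neyman-Pearson criterion and the ROC curve describe.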
Estimation Theory
Sometimes we make characterizations of data – averages, parameter estimates, etc.  Estimation from data is essentially an extension of decision making, a natural next section from Decision Theory.
– Estimation as Extension of M-ary Hypothesis Test
– Unbiased Estimation
– Minimum Mean Square Error (MMSE)
– Maximum Likelihood Estimation (MLE)
– Maximum A Posteriori Estimation (MAP)
– Kalman Filter
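
As a sketch of maximum likelihood estimation and of what “unbiased” means in practice, the example below estimates the mean and variance of data assumed to be Gaussian. The data values are invented and the function name is an assumption. The MLE of the variance divides by n, while the unbiased sample variance divides by n - 1.

```python
import statistics


def gaussian_mle(samples):
    """Maximum likelihood estimates of the mean and variance of a Gaussian."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / n  # divides by n
    return mean, variance


data = [2.1, 1.9, 2.4, 2.0, 1.6]  # made-up measurements
mean_hat, var_mle = gaussian_mle(data)
var_unbiased = statistics.variance(data)  # divides by n - 1

print(f"MLE mean: {mean_hat:.3f}")
print(f"MLE variance: {var_mle:.3f}")
print(f"Unbiased variance: {var_unbiased:.3f}")
```

The two variance estimates differ by the factor n / (n - 1), so the gap matters for small samples and fades as the sample grows.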
Coordinate Systems
To bring various data elements together into a common decision making framework, we need to know how to align the data.  Knowledge of coordinate systems and how they are used becomes important to lay a solid foundation for bringing disparate data together.
– Introduction to Coordinate Systems
– Euclidean Spaces
– Orthogonal Coordinate Systems
– Properties of Orthogonal Coordinate Systems (angle, dot product, coordinate transformations, etc.)
– Cartesian Coordinate System
– Polar Coordinate System
– Cylindrical Coordinate System
– Spherical Coordinate System
– Transformations Between Coordinate Systems
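
A transformation between coordinate systems can be sketched in a few lines. The example below converts a 2D point between Cartesian and polar coordinates and back; the function names are assumptions chosen for this illustration.

```python
import math


def cartesian_to_polar(x, y):
    """Return (r, theta), with theta in radians measured from the positive x-axis."""
    return math.hypot(x, y), math.atan2(y, x)


def polar_to_cartesian(r, theta):
    """Return (x, y) for a radius r and an angle theta in radians."""
    return r * math.cos(theta), r * math.sin(theta)


r, theta = cartesian_to_polar(3.0, 4.0)
x, y = polar_to_cartesian(r, theta)
print(f"r = {r}, theta = {theta:.4f} rad")
print(f"back to Cartesian: ({x:.4f}, {y:.4f})")
```

Using `math.atan2` rather than dividing y by x keeps the angle in the correct quadrant and avoids a division by zero when x is 0.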
Linear Transformations
Once we understand coordinate systems, we can learn why we transform data to get at the underlying information. This section describes how we can transform our data into other useful data products through various types of transformations, including the popular Fourier transform.
– Introduction to Linear Transformations
– Properties of Linear Transformations
– Matrix Multiplication
– Fourier Transform
– Properties of Fourier Transforms (time-frequency relationship, shift invariance, spectral properties, Parseval’s Theorem, Convolution Theorem, etc.)
– Discrete and Continuous Fourier Transforms
– Uncertainty Principle and Aliasing
– Wavelet and Other Transforms
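
To make the discrete Fourier transform and Parseval’s theorem concrete, here is a minimal sketch using only the standard library. It is a direct O(n²) implementation for illustration, not a fast Fourier transform, and the signal values are invented. With this unnormalized DFT convention, Parseval’s theorem says the sum of |x|² equals the sum of |X|² divided by n.

```python
import cmath


def dft(signal):
    """Direct discrete Fourier transform: X[k] = sum over t of x[t] * exp(-2j*pi*k*t/n)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


signal = [1.0, 2.0, 0.0, -1.0]  # made-up samples
spectrum = dft(signal)

# Parseval's theorem: energy is the same in the time and frequency domains.
energy_time = sum(abs(x) ** 2 for x in signal)
energy_freq = sum(abs(X) ** 2 for X in spectrum) / len(signal)

print(f"energy in time domain:      {energy_time:.6f}")
print(f"energy in frequency domain: {energy_freq:.6f}")
```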
Effects of Computation on Data
An often overlooked aspect of data science is the impact that the algorithms we apply have on the information we are seeking. Merely applying algorithms and computations to create analytics and other data products affects the effectiveness of data-driven decision making. This section takes us on a journey through these advanced aspects of data science.
– Mathematical Representation of Computation
– Reversible Computations (Bijective Mapping)
– Irreversible Computations
– Impulse Response Functions
– Transformation of Probability Distributions (due to addition, subtraction, multiplication, division, arbitrary computations, etc.)
– Impacts on Decision Making
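
One way to see how a computation transforms a probability distribution is to add two independent random variables: the distribution of the sum is the convolution of the two input distributions. The sketch below does this for two fair dice using exact fractions; the dice example and the function name are assumptions for illustration.

```python
from collections import defaultdict
from fractions import Fraction


def pmf_of_sum(pmf_a, pmf_b):
    """Distribution of A + B for independent A and B (a discrete convolution)."""
    result = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for a, pa in pmf_a.items():
        for b, pb in pmf_b.items():
            result[a + b] += pa * pb
    return dict(result)


die = {face: Fraction(1, 6) for face in range(1, 7)}  # uniform distribution
two_dice = pmf_of_sum(die, die)

for total in sorted(two_dice):
    print(total, two_dice[total])
```

Each input is uniform, but the output is triangular, peaking at 7. The addition is also an irreversible computation: a total of 7 does not tell us which of the six possible pairs produced it.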
Prototype Coding / Programming
One of the key elements to data science is the willingness of practitioners to “get their hands dirty” with data.  This means being able to write programs that access, process, and visualize data in important languages in science and industry. This section takes us on a tour of these important elements.
– Introduction to Programming
– Data Types, Variables, and Functions
– Data Structures (Arrays, etc.)
– Loops, Comparisons, If-Then-Else
– Functions
– Scripting Languages vs. Compilable Languages
– R
– Python
– C++
Graph Theory
Graphs are ways to illustrate connections between different data elements, and they are important in today’s interconnected world.
– Introduction to Graph Theory
– Undirected Graphs
– Directed Graphs
– Various Graph Data Structures
– Route and Network Problems
Algorithms
Key to data science is understanding the use of algorithms to compute important data-derived metrics.  Popular data manipulation algorithms are included in this section.
– Introduction to Algorithms
– Recursive Algorithms
– Serial, Parallel, and Distributed Algorithms
– Exhaustive Search
– Divide-and-Conquer (Binary Search)
– Gradient Search
– Sorting Algorithms
– Linear Programming
– Greedy Algorithms
– Heuristic Algorithms
– Randomized Algorithms
– Shortest Path Algorithms for Graphs
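
As an example of a shortest path algorithm on a directed, weighted graph, here is a minimal sketch of Dijkstra’s algorithm using a binary heap. The graph, its node names, and its edge weights are invented for illustration, the graph is stored as a dictionary of adjacency dictionaries, and the function name is an assumption. Dijkstra’s algorithm requires non-negative edge weights.

```python
import heapq


def shortest_distances(graph, source):
    """Return the shortest distance from source to every reachable node."""
    dist = {source: 0}
    queue = [(0, source)]  # (distance so far, node)
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale entry: a shorter path to node was already found
        for neighbor, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return dist


# Hypothetical directed graph: graph[u][v] is the weight of the edge u -> v.
graph = {
    "A": {"B": 4, "C": 1},
    "B": {"D": 1},
    "C": {"B": 2, "D": 7},
    "D": {},
}
distances = shortest_distances(graph, "A")
print(distances)
```

The algorithm is greedy: it always settles the closest unsettled node next, which is why it appears alongside greedy algorithms in many courses.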
Machine Learning
No data science fundamentals course would be complete without exposure to machine learning.  However, it’s important to know that these techniques build upon the fundamentals described in previous sections.  This section gives practitioners an understanding of useful and popular machine learning techniques and why they are applied.
– Introduction to Machine Learning
– Linear Classifiers (Logistic Regression, Naive Bayes Classifier, Support Vector Machines)
– Decision Trees (Random Forests)
– Bayesian Networks
– Hidden Markov Models
– Expectation-Maximization
– Artificial Neural Networks and Deep Learning
– Vector Quantization
– K-Means Clustering
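
To illustrate k-means clustering, here is a minimal one-dimensional sketch of the usual assign-then-update loop (Lloyd’s algorithm). The data points, the starting centroids, and the function name are assumptions for illustration; practical implementations work in many dimensions and choose starting centroids more carefully.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and re-averaging."""
    centroids = list(centroids)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids


points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.5]  # two obvious groups, made up
centers = kmeans_1d(points, centroids=[0.0, 5.0])
print(centers)
```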
Source: https://www.linkedin.com/pulse/fundamentals-data-science-mic-farris

Top 12 interesting careers to explore in Big Data


    These people use their analytical and technical capabilities to extract meaningful insights from data.
    Salary: $65,000 - $110,000
    They ensure uninterrupted flow of data between servers and applications and are also responsible for data architecture.
    Salary: $60,0945 - $124,635
    Big Data Engineers build the designs created by solutions architects. They develop, maintain, test and evaluate big data solutions within organizations.
    Salary: $100,000 - $165,000
    They work in the research and development of algorithms that are used in adaptive systems. They build methods for predicting product suggestions and demand forecasting, and explore Big Data to automatically extract patterns.
    Salary: $78,857 - $124,597
    A business analytics specialist supports various development initiatives, assists in testing activities and in the development of test scripts, performs research to understand business issues, and develops practical, cost-effective solutions to problems.
    Salary: $50,861 - $94,209
    They design, develop, and provide production support for interactive data visualizations used across the enterprise. They possess an artistic mind that conceptualizes, designs, and develops reusable graphic/data visualizations, and they use strong technical knowledge to implement these visualizations with the latest technologies.
    Salary: $108,000 - $130,000
    They have data analysis expertise and experience in setting up reporting tools and in querying and maintaining data warehouses. They are hands-on with big data and take a data-driven approach to solving complex problems.
    Salary: $96,710 - $138,591
    They come up with solutions quickly to help businesses make time-sensitive decisions, and they have strong communication and analytical skills, a passion for data visualization, and a drive for excellence and self-motivation.
    Salary: $107,000 - $162,000
    They are responsible for supporting an enterprise-wide business intelligence framework. This position requires critical thinking, attention to detail, and effective communication skills.
    Salary: $77,969 - $128,337
    An analytics manager is responsible for the configuration, design, implementation, and support of a data analysis solution or BI tool. They are specifically required to analyze huge quantities of information gathered through transactional activity.
    Salary: $83,910 - $134,943
    A machine learning engineer’s final “output” is working software, and the “audience” for this output consists of other software components that run autonomously with minimal human supervision. The decisions are made by machines, and they affect how a product or service behaves.
    Salary: $96,710 - $138,591
    They gather numerical data and then display it, helping companies make sense of quantitative data, spot trends, and make predictions.
    Salary: $57,000 - $80,110


  • Hadoop
  • SAS
  • Excel
  • R
  • MongoDB
  • Python
  • Pandas
  • Apache Spark & Scala
  • Apache Storm
  • Apache Cassandra
  • MapReduce
  • Cloudera
  • HBase
  • Pig
  • Flume
  • Hive
  • Zookeeper