Sunday, December 29, 2013

One lap around the web, one basketful of news: from MicroAd Vietnam to an idea for a news app

MicroAd is a Japanese advertising platform (with an R&D department in Japan specializing in display ads) that offers real-time bidding and is starting to pay attention to the Vietnamese market.



Circa (http://cir.ca) is a startup that may well break through in 2014 in mobile news by delivering a good user experience (UX) to mobile users.

Tech Stack:
  • scalable services, from web indexing and visualization to an API for the real-time delivery of content and metadata
  • conceptualizing and creating intuitive, engaging, and brand-consistent mobile experiences




Sunday, December 22, 2013

What's wrong with big data?

It's too big
  1. to do
  2. to understand
  3. to turn into core value
Solution: divide and conquer http://en.wikipedia.org/wiki/Divide_and_conquer
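
As a rough illustration of the divide-and-conquer idea (not a real big data stack), here is a minimal Python sketch: split the data into small chunks, summarize each chunk on its own, then merge the partial results. The function names and the sample click log are made up for this example.

```python
from collections import Counter
from typing import Iterable, List


def split_into_chunks(items: List[str], chunk_size: int) -> Iterable[List[str]]:
    """Divide: cut the data into pieces small enough to handle one at a time."""
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


def count_chunk(chunk: List[str]) -> Counter:
    """Conquer: summarize one small piece independently of the others."""
    return Counter(chunk)


def count_all(items: List[str], chunk_size: int = 3) -> Counter:
    """Combine: merge the partial summaries into a single result."""
    total: Counter = Counter()
    for chunk in split_into_chunks(items, chunk_size):
        total.update(count_chunk(chunk))
    return total


# Hypothetical sample data: page categories seen in a click log.
clicks = ["news", "ads", "news", "video", "ads", "news", "video"]
totals = count_all(clicks)
```

Because each chunk is counted independently, the per-chunk step could in principle run on separate machines; the chunk size does not change the final totals.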

Is it hard to do? No, it gets easier day by day.



Not easy to understand? No, just follow 4 steps:
  1. Look at what data you have
  2. See the data in different ways
  3. Imagine what you want to see
  4. Show the data in a simple graph that anyone can understand






From data to core value?
http://www.amazon.com/Too-Big-Ignore-Business-Wiley/dp/1118638174/



ARMONK, N.Y. - 17 Dec 2013: Today IBM (NYSE: IBM) unveiled the eighth annual  "IBM 5 in 5" (#ibm5in5) – a list of innovations that have the potential to change the way people work, live and interact during the next five years.
This year’s IBM 5 in 5 explores the idea that everything will learn – driven by a new era of cognitive systems where machines will learn, reason and engage with us in a more natural and personalized way. These innovations are beginning to emerge enabled by cloud computing, big data analytics and learning technologies all coming together, with the appropriate privacy and security considerations, for consumers, citizens, students and patients.

Hadoop is not the final Big Data solution?

Yes, http://www.1010data.com/images/327/Press-Release-1010data_2013_Big_Data_in_Business_Study.pdf

Wednesday, December 18, 2013

Ideas for STAK - the framework for real-time reactive analytics

You or someone on your team is suggesting a change that just might work. But why act on a hunch when you can hold out for evidence? According to the author, the best way to support decision making on potential innovations is to...
  • Design an experiment.
Start with a hypothesis about how the change will help the business. If it’s a good one, you’ll learn as much by disproving it as you would by proving it. Put it to the test by measuring what happens in a test group versus a control group. From the outset, be clear on what you need to measure to produce a decisive result—and whether that’s a metric you even have the capability to track.
  • Act on the facts.
Nothing but a success in a testing environment should be rolled out more broadly. But neither should failures simply be scrapped. Refine the hypothesis on the basis of the results, and consider testing a variation. Most important, capture what’s been learned, and make it available to others in the organization through a “learning library,” so resources aren’t wasted proving the same thing again.
Example: Marketers at the Subway restaurant chain wanted to drum up business by putting foot-long subs on sale for only $5, but franchise owners worried that the promotion would lure existing customers away from higher-priced menu items. An experiment pitting test sites against control sites proved that the promotion would pay off—which it subsequently did.
  • Make testing the norm.
Create the training and infrastructure that will enable nonexperts in statistics to oversee rigorous experiments. Off-the-shelf software can walk them through the steps and help them analyze results. A core group of experts can lend resources and expertise and maintain the learning library. Leadership must cultivate a test-and-learn culture, in part by penalizing those who act without sufficient evidence.
As your managers become more comfortable with testing, they’ll discover that it paves the way for, rather than throwing up barriers to, promising new ideas.
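
To make the "test group versus control group" step concrete, here is a minimal sketch in Python of one common way to compare two groups: a two-proportion z-test on conversion rates. This is an assumed choice of test, not something the quoted article prescribes, and the visitor and conversion numbers are invented for illustration.

```python
import math
from typing import Tuple


def two_proportion_z_test(conv_test: int, n_test: int,
                          conv_control: int, n_control: int) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_test = conv_test / n_test
    p_control = conv_control / n_control
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_test + conv_control) / (n_test + n_control)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_test + 1 / n_control))
    z = (p_test - p_control) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical numbers: 1000 visitors per group, 120 vs 100 conversions.
z, p_value = two_proportion_z_test(120, 1000, 100, 1000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

A small p-value suggests the difference is unlikely to be chance alone; which threshold counts as "decisive" is a decision to make before the experiment starts, as the text recommends.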
New framework for Stage 5: autonomous analytics 
Autonomous Actor + Data Pipeline + In-memory + Reactive + Functor + Deep Learning 



Predictive Analytics using Storm, Hadoop, R and AWS

This presentation gives a quick refresher on Storm concepts; however, most of the time will be spent discussing a recent project where Storm was a critical part of implementing a predictive analytics use case for an actual customer.


This talk provides an overview of the open source Storm system for processing Big Data in realtime. The talk starts with an overview of the technology, including key components: Nimbus, Zookeeper, Topology, Tuple, Trident. The presentation then dives into the complex Big Data architecture in which Storm can be integrated. The result is a compelling stack of technologies including integrated Hadoop clusters, MPP, and NoSQL databases.

Tuesday, December 17, 2013

PALANTIR BIG DATA TECHNOLOGIES - from devil to angel?

A devil?
Among the companies exposed in the eavesdropping scandal that the Washington Post listed (June 7, 2013) (Microsoft, Yahoo, Google, Facebook, PalTalk, AOL, Skype...), the software firm Palantir was not named. However, few people know that in recent years Palantir Technologies has been one of the "back doors" playing an especially important role for the US intelligence community, in the fight against terrorism in particular and in surveillance and eavesdropping in general...

http://www.businessweek.com/magazine/palantir-the-vanguard-of-cyberterror-security-11222011.html



Peter Thiel - co-founder of Palantir Technologies
An angel?
How we’re building an information infrastructure for Typhoon Haiyan response operations
Typhoon Haiyan has claimed the lives of thousands and displaced millions more. Along with other aid organizations from around the world, our disaster response partners Team Rubicon and Direct Relief have mobilized to provide relief to those affected by the storm, and we’ve been working closely with them to support their efforts.

We’ve been hacking away furiously all week to support these efforts. Here’s an update of what we’re already doing and what we have planned.
Raven with live data from Tacloban.
http://www.palantir.com/2013/11/how-were-building-an-information-infrastructure-for-typhoon-haiyan-response-operations/

Look at the job listings to see how Palantir operates
https://www.palantir.com/careers/

Tuesday, December 10, 2013

Open resources for active news app and Dashboard


Resources for implementation (active news app)
  https://speakerdeck.com/sritchie/summingbird-at-cufp - a new tool for big data processing (unifies Kafka + Hadoop HDFS)
  http://guidetodatamining.com - Machine Learning for developers
  https://help.ubuntu.com/community/AndroidSDK - setup Android dev on ubuntu
  http://blogs.shephertz.com/tag/app42-android-push-sample-using-phonegap/ - pushed news
  http://probcomp.csail.mit.edu/bayesdb/docs/gettingstarted.html - fuzzy database
  https://www.reinvigorate.net/features/ - tracking service
  https://github.com/anchor/pollchat - demo for using Kafka as Scalable PubSub Messaging

UI/UX for Log Dashboard
  http://techslides.com/over-1000-d3-js-examples-and-demos/
  http://bl.ocks.org/tomerd/1499279 - google style gauges using d3.js
  http://bl.ocks.org/mbostock/1157787 - Small Multiples for table of KPI Metrics monitor
  http://bl.ocks.org/mbostock/1468715 - Visualize how many we have/how many we used
  http://apigee.com/docs/app-services/content/investigate-issues-your-mobile-app - error log filter
  http://bl.ocks.org/mbostock/3885211 - Visualize Browser (KPI with fewer than 20 members)
  http://prcweb.co.uk/lab/circularheat/ - Visualize time and heatmap

UI/UX Pattern for Mobile
http://pttrns.com/ - a curated library of iPhone and iPad user interface patterns
http://www.mobiledesignpatterngallery.com/mobile-patterns.php
http://android.inspired-ui.com/

Case Studies:
http://zite.com/
http://www.fastcompany.com/1736533/personalized-ipad-magazine-zite-learns-you-read-challenges-flipboard

Think more:








Wednesday, December 4, 2013

How can useful information (news, music, movies, ...) proactively reach the person who needs it?



Initial results when entering the word "movie" to find related words (unsupervised training):
Enter word or sentence (EXIT to break): movie
 Word       Cosine distance
------------------------------------------------------------------------
                                              film 0.726205
                                            movies 0.724130
                                             films 0.704162
                                            remake 0.646792
                                            batman 0.640161
                                    blaxploitation 0.629710
                                            gojira 0.620710
                                          animated 0.615535
                                           cartoon 0.611009
                                              toho 0.606068
                                        highlander 0.605127
                                             kaiju 0.604210
                                          godzilla 0.596378
                                          starring 0.592826
                                        soundtrack 0.58806
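
The score in the listing above is the cosine of the angle between two word vectors, so higher values mean closer words even though the column is labeled "distance". Below is a minimal Python sketch of how such a ranking can be computed. The three-dimensional vectors are toy values made up for illustration; real word vectors have hundreds of dimensions and come from training.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_words(query: str, vectors: Dict[str, List[float]],
                  top_n: int = 3) -> List[Tuple[str, float]]:
    """Rank every other word by cosine similarity to the query word."""
    query_vec = vectors[query]
    scored = [(word, cosine_similarity(query_vec, vec))
              for word, vec in vectors.items() if word != query]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Toy vectors, invented for this example only.
vectors = {
    "movie": [0.9, 0.1, 0.0],
    "film": [0.8, 0.2, 0.1],
    "soundtrack": [0.4, 0.7, 0.1],
    "banana": [0.0, 0.1, 0.9],
}
ranked = nearest_words("movie", vectors)
for word, score in ranked:
    print(f"{word:>12} {score:.6f}")
```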



Going back in time a bit: six years ago, around 2007, I was haunted by a description of how functions move (functors) between systems, mentioned by Peter (http://peteg.org/), my supervisor.
In truth I still puzzle over it today, because it is quite abstract. I set up the blog http://activefunctor.blogspot.com/ to write it all down and tease my brain a little.

Around the same time, a number of open-source big data projects (specifically Hadoop - MapReduce) began to take shape after Google published this paper


In short, unstructured information in the form of text/photos is abundant and diverse. The spread of social media and mobile apps makes its volume grow very quickly.
History tells us that where there is demand, there will be supply. The emergence of frameworks for big data and text analysis to solve these problems was inevitable.

Today I read a few articles about deep learning
http://gigaom.com/2013/08/16/were-on-the-cusp-of-deep-learning-for-the-masses-you-can-thank-google-later/
http://venturebeat.com/2012/11/25/deep-learning/
http://gigaom.com/2013/11/01/the-gigaom-guide-to-deep-learning-whos-doing-it-and-why-it-matters/

What if a crawler robot went and indexed the information on Facebook?

See this post for details; at the very least, I had already thought of it back in 2011




Why Mobile Ads Don’t Work and How to Fix Them!


A follow-up to the post http://www.mc2ads.com/2013/12/loi-va-hai-va-y-tuong-khi-quang-cao-cac.html

Why Mobile Ads Don’t Work?

Display ads function well in print and on desktop computers. But there’s a growing consensus that they just don’t work on mobile devices. Here are three reasons why:

People Don’t Like Them
Surveys show that people find mobile ads more intrusive than desktop ads, because mobile is a more private venue. In fact, fully four in five say that mobile ads are “unacceptable.”

There’s No Right Side
PC users are conditioned to find ads in the right margin of the screen—they appear that way on Facebook and in Google search results, for example. But mobile screens are too small to have a usable right margin, so ads pop up in unexpected places.

The “Fat Finger” Effect
Advertisers closely track how many users tap on an ad. But many of those taps are inadvertent, because the ads are tiny—so it’s difficult to judge an ad’s effectiveness.

Strategies for mobile ads:

  1. Add convenience
  2. Offer unique value
  3. Provide social value
  4. Provide beneficial information
  5. Offer the right products to the right people at the right place and time, matching the right demand
Source:
http://hbr.org/2013/03/for-mobile-devices-think-apps-not-ads/ar/1



Source:
http://hbr.org/2013/12/making-mobile-ads-that-work/ar/1

Tuesday, December 3, 2013

Pros, cons, and ideas for advertising in smartphone apps (mobile ads)

A bit of terminology from the IAB
Location Based Advertising (LBA)
http://www.iabuk.net/sites/default/files/white-paper-docs/Location%20Based%20Advertising%20-%20Whitepaper.pdf

First, the downsides:

Apple is an artist and rather dislikes ads (simply because ads care more about money than about being usable and about usability for the user).
Apple does not approve apps in the category for children under 13 that collect behavioral data; such apps are rejected on submission.
http://mobilemarketingmagazine.com/behavioural-ads-banned-ios-kids-apps-category/

If the ads do not carry "beneficial information", the app will be barred from the App Store. Having location information collected is a sensitive matter; nobody likes being tracked.
Privacy, privacy, and privacy!
http://www.informationweek.com/business/apple-bans-location-based-iphone-ads/d/d-id/1086705?

These two articles cover the general benefits:
Location Targeting: Perception And Reality
Is location-based advertising right for you?

From a user's point of view, Location Based Advertising offers benefits such as:

  • Users receive advertising information (discounts, promotions, sales, ...) in real time (like SMS) within the targeted area (location).
  • If it is well presented, a good-looking banner at the right time and place becomes an extremely valuable message (not impersonal or annoying like SMS)
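
The "targeted area" idea can be sketched as a simple geofence check: compute the distance between the user and each offer, then keep the offers inside a radius. This is a minimal Python sketch under stated assumptions; the offer records, coordinates, and the 1 km radius are invented for illustration, and a real ad platform would do much more (consent, caching, ranking).

```python
import math
from typing import Dict, List

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two (latitude, longitude) points in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def offers_in_range(user_lat: float, user_lon: float,
                    offers: List[Dict], radius_km: float) -> List[Dict]:
    """Keep only the offers whose location lies within radius_km of the user."""
    return [offer for offer in offers
            if haversine_km(user_lat, user_lon,
                            offer["lat"], offer["lon"]) <= radius_km]


# Hypothetical offers: one near central Ho Chi Minh City, one in Hanoi.
offers = [
    {"name": "coffee discount", "lat": 10.7800, "lon": 106.6990},
    {"name": "hanoi bookstore sale", "lat": 21.0278, "lon": 105.8342},
]
nearby = offers_in_range(10.7769, 106.7009, offers, radius_km=1.0)
print([offer["name"] for offer in nearby])
```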

Ideas for increasing the benefits of LBA:



  • Deliver a good message at the right time and place.
http://pttrns.com/categories/4-notifications
  • Increase usability for the user



Build a browser dedicated solely to displaying ads?