Video Episodes:
92 Views
18:06:09 10/30/09
Christian Kreibich: Infiltrating a Botnet
Christian Kreibich: Infiltrating a Botnet The Association for Computing Machinery - HP, Oak Room, Cupertino ICIR research scientist Christian Kreibich presents a study on Internet spam. He describes his team's infiltration of the famous "Storm" botnet, documenting its success rates as well as ways to protect users from it.
26 Views
17:26:36 09/23/09
Sean Murphy: The Limits of 'I'll Know It When I See It'
Sean Murphy: The Limits of 'I'll Know It When I See It' The Association for Computing Machinery - Hewlett Packard How can teams become more effective at making decisions quickly and reaching a working consensus? Sean Murphy explores team decision making, the challenges in blending human expertise, and the use of software tools to maintain a shared situational awareness and improve the timeliness and quality of decisions. Murphy explores problems that engineering teams and startups wrestle with as they move from hunches to working technology to product to repeatable, scalable processes. He also covers how groups make decisions and how to systemically approach the management of exploration and verification cycles in product and customer development situations. Whether you are a solo entrepreneur, a talented engineer, a manager, or a CEO, you will learn some practical tips for leading your team to successful decision making.
67 Views
13:58:57 09/16/09
Hierarchical Temporal Memory: Subutai Ahmad
Hierarchical Temporal Memory: Subutai Ahmad The Association for Computing Machinery - NASA Exploration Center Many basic tasks of perception, pattern recognition, and motor control are easy for people but impossible for computers. No existing machine can recognize pictures, understand language, or swiftly navigate through a cluttered room. The gap hints at multiple business opportunities and represents a new industry in intelligent computing. Numenta is creating a technology, called Hierarchical Temporal Memory (HTM), based on the principles of the neocortex. The theory, developed by Jeff Hawkins and described in his book On Intelligence, explains how the hierarchical structure of the neocortex builds a model of its world and uses this model for inference and prediction. Numenta has released an open research platform that allows software programmers to apply this theory to a variety of problems. We have started applying the technology to a number of domains, including vision, text analytics, and web analytics. Subutai Ahmad describes the basics of HTM theory, its applications, and Numenta's collaborative business model. He focuses on potential applications of HTM in the data mining field.
57 Views
17:09:16 08/25/09
The Facebook Era: Clara Shih
The Facebook Era: Clara Shih The Association for Computing Machinery - Hewlett Packard The Facebook Era is a newly released book about how social networking sites like Facebook and Twitter are ushering in a new era of business, relationships, and culture. Additionally, it discusses what companies need to do strategically and tactically to adapt and thrive in this new environment. The last decade was about the World Wide Web of information and the power of linking content pages. Today, it's about the World Wide Web of people and the power of the social graph. We are undergoing a radical transformation as traditional one-sided CRM gives way to bi-directional visibility and access, and an unprecedented degree of trusted online identity and access to people are forever changing human relationships and business transactions. Facebook, Twitter, and LinkedIn are changing everything we thought we knew about sales, marketing, and product development -- and empowering companies with new tools, insights, and ability to transform customers into true partners and your most effective sales force yet.
73 Views
14:08:07 08/24/09
Josh Herbach: PLANET, MapReduce, and Decision Trees
Josh Herbach: PLANET, MapReduce, and Decision Trees The Association for Computing Machinery - NASA Ames Research Center Classification and regression tree learning on massive datasets is a common data mining task at Google, yet many state-of-the-art tree learning algorithms require training data to reside in memory on a single machine. While more scalable implementations of tree learning have been proposed, they typically require specialized parallel computing architectures. In contrast, the majority of Google's computing infrastructure is based on commodity hardware. In this paper, Herbach describes PLANET: a scalable distributed framework for learning tree models over large datasets. PLANET defines tree learning as a series of distributed computations, and implements each one using the MapReduce model of distributed computation. Herbach shows how this framework supports scalable construction of classification and regression trees, as well as ensembles of such models. Herbach also discusses the benefits and challenges of using a MapReduce compute cluster for tree learning, and demonstrates the scalability of this approach by applying it to a real-world learning task from the domain of computational advertising.
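To give a feel for how one tree-learning step can be phrased as a map phase and a reduce phase, here is a minimal single-process sketch in Python. It is not PLANET's implementation: the function names, the single numeric feature, the fixed list of candidate thresholds, and the variance-reduction criterion are all assumptions made for illustration. Each "mapper" summarizes its own partition as (count, sum, sum of squares) per candidate threshold, and the "reducer" adds those summaries and scores each split.

```python
from collections import defaultdict


def map_partition(rows, thresholds):
    """Map step (hypothetical): summarize one data partition.

    rows is a list of (x, y) pairs. For every candidate threshold t, emit
    [count, sum_y, sum_y2] over rows with x < t. The key None holds totals.
    """
    out = defaultdict(lambda: [0, 0.0, 0.0])
    for x, y in rows:
        for key in [None] + [t for t in thresholds if x < t]:
            stats = out[key]
            stats[0] += 1
            stats[1] += y
            stats[2] += y * y
    return out


def reduce_stats(partials):
    """Reduce step (hypothetical): add up the per-partition summaries."""
    total = defaultdict(lambda: [0, 0.0, 0.0])
    for partial in partials:
        for key, (n, s, s2) in partial.items():
            acc = total[key]
            acc[0] += n
            acc[1] += s
            acc[2] += s2
    return total


def sse(n, s, s2):
    """Sum of squared errors around the mean, from sufficient statistics."""
    return s2 - s * s / n if n else 0.0


def best_split(partitions, thresholds):
    """Return (threshold, gain) with the largest SSE reduction, or None."""
    stats = reduce_stats(map_partition(p, thresholds) for p in partitions)
    n, s, s2 = stats[None]
    best = None
    for t in thresholds:
        ln, ls, ls2 = stats[t]
        rn, rs, rs2 = n - ln, s - ls, s2 - ls2
        if ln == 0 or rn == 0:
            continue  # a split must leave rows on both sides
        gain = sse(n, s, s2) - sse(ln, ls, ls2) - sse(rn, rs, rs2)
        if best is None or gain > best[1]:
            best = (t, gain)
    return best
```

Because only small summaries leave each partition, no single machine needs to hold the full training set, which is the property the abstract emphasizes.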
57 Views
16:16:02 07/14/09
Jeff Scargle: Optimal Segmentation Analysis of Event Data
Jeff Scargle: Optimal Segmentation Analysis of Event Data The Association for Computing Machinery - NASA Exploration Center Jeffrey D. Scargle, research astrophysicist in the Planetary Systems Branch, Astrobiology and Space Science Division at NASA Ames Research Center, presents a practical algorithm for optimal segmentation analysis of event data. Useful information from data sets consisting of points, or other measurements, distributed over a data space of dimension 1 or higher, can be extracted using data segmentation. A simple algorithm based on an old dynamic programming concept provides an effective and practical approach to such problems. This non-parametric approach has many desired properties: no artificial limits on the scales or resolution of signals that can be recovered, objectivity, automation, elimination of noise while conserving the valid information in the data, adaptability to multivariate problems, and ability to incorporate variable instrumental efficiency and auxiliary information. The algorithm is demonstrated on astronomical data, including gamma-ray data from the NASA Fermi Gamma Ray Space Telescope, the Sloan Digital Sky Survey of large scale structure of the Universe, and other astronomical observatories.
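The dynamic programming idea mentioned in this abstract can be sketched for one-dimensional event times as follows. This is an illustrative sketch, not necessarily the algorithm presented in the talk: the block fitness N * log(N / T), the fixed per-block penalty, and the function name are assumptions. The key step is that the best partition of the first r + 1 events is found by trying every possible start for the last block and reusing the best partitions already computed for shorter prefixes.

```python
import math


def optimal_segmentation(times, penalty=4.0):
    """Return block edges that maximize total fitness over all partitions.

    Block fitness is N * log(N / T) for N events in a block of width T,
    minus a fixed per-block penalty (both are illustrative assumptions).
    """
    t = sorted(times)
    n = len(t)
    if n < 2 or len(set(t)) != n:
        raise ValueError("need at least two distinct event times")

    # Candidate edges: first event, midpoints between neighbors, last event.
    edges = [t[0]] + [(a + b) / 2 for a, b in zip(t, t[1:])] + [t[-1]]

    best = [0.0] * n  # best[r]: best total fitness for events 0..r
    start = [0] * n   # start[r]: where the last block of that optimum begins

    for r in range(n):
        best_val, best_start = -math.inf, 0
        for s in range(r + 1):
            count = r - s + 1
            width = edges[r + 1] - edges[s]
            fitness = count * math.log(count / width) - penalty
            total = fitness + (best[s - 1] if s > 0 else 0.0)
            if total > best_val:
                best_val, best_start = total, s
        best[r], start[r] = best_val, best_start

    # Walk back through the stored block starts to recover the edges.
    result = []
    r = n - 1
    while r >= 0:
        s = start[r]
        result.append(edges[s])
        r = s - 1
    result.reverse()
    return result + [edges[n]]
```

The double loop makes this O(n^2) in the number of events. A larger penalty yields fewer blocks; the value 4.0 is arbitrary.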
45 Views
16:44:34 06/30/09
Klaus Roder: The Architecture of Mashups
Klaus Roder: The Architecture of Mashups The Association for Computing Machinery - Hewlett Packard IBM software engineer Klaus Roder explains what Mashups are, where they can be used in the enterprise, why they are important for the enterprise and how to quickly build Mashups with the IBM Mashup Center.
85 Views
17:55:07 06/16/09
Dan Steinberg on Interaction Detection with TreeNet
Dan Steinberg on Interaction Detection with TreeNet The Association for Computing Machinery - Yahoo! Recent advances in machine learning technology make it possible to determine definitively whether or not interactions of any degree need to be included in a predictive model. We can thus establish conclusively, for example, for a given set of predictors, that an additive model (one with no interactions) cannot be improved upon with interactions. Or alternatively, one might prove that a model with interactions will outperform a model without them. Further, we can now identify precisely which interactions are supported by the data, and also the degree of interaction, even in very high dimensional data. The tools we use to achieve these results are extensions of Stanford University Professor Jerome Friedman's TreeNet, developed by the authors and embedded in the Salford Systems TreeNet 2.0 Pro Ex product. Steinberg illustrates the concepts in the context of a real-world regression model where we are quickly able to identify all the important interactions with a modest number of boosted tree ensemble models.
123 Views
09:31:47 05/28/09
Michael Bowles: Neural Nets & Rule-Based Trading Systems
Michael Bowles: Neural Nets & Rule-Based Trading Systems The Association for Computing Machinery - SAP LABS The 45% drop in the US equity markets has caused even stalwarts to question the wisdom of the "buy and hold" strategy. But rule-based approaches for deciding when to buy or sell suffer the same problem. Sometimes they work and sometimes they don't. In this presentation, Dr. Mike Bowles shows how familiar data-mining tools can be used to derive a robust algorithmic trading system. A simple rule-based trend-following system serves as a starting point. He looks at that system's characteristics and then employs a neural net to predict which of the system's trades should be taken and which ones should be skipped. Bowles demonstrates that this significantly improves the performance of the trading system (from a Sharpe ratio of 1.6 to a Sharpe ratio of 3.6). This example illustrates one way in which data mining tools have proven useful to practitioners of quantitative finance.
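For readers unfamiliar with the metric quoted above, the following Python sketch shows one common way to compute an annualized Sharpe ratio and how skipping some trades changes it. The function names, the 252-period annualization, and the zero risk-free rate are assumptions for illustration, and the take/skip flags stand in for the output of a classifier such as the neural net described in the talk; none of this is taken from the presentation itself.

```python
import math
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio of a series of per-period returns."""
    excess = [r - risk_free for r in returns]
    spread = stdev(excess)  # sample standard deviation; needs >= 2 values
    if spread == 0:
        raise ValueError("returns have zero variance")
    return mean(excess) / spread * math.sqrt(periods_per_year)


def filter_trades(trade_returns, take_trade):
    """Keep only the trades a (hypothetical) classifier flags as worth taking."""
    return [r for r, take in zip(trade_returns, take_trade) if take]


# Made-up numbers: five trades, two of which the classifier skips.
all_trades = [0.02, -0.01, 0.03, -0.02, 0.01]
flags = [True, False, True, False, True]
kept = filter_trades(all_trades, flags)
```

Comparing sharpe_ratio(all_trades) with sharpe_ratio(kept) shows the effect of the filter on this toy data; real results depend entirely on how well the classifier predicts losing trades.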
115 Views
16:36:19 04/28/09
Empowering Internet Users: Two Ideas to Reshape Broadband
Empowering Internet Users: Two Ideas to Reshape Broadband The Association for Computing Machinery - Association for Computing Machinery Google Policy Analyst Derek Slater discusses the state of broadband policy in the U.S. and the importance of sustaining the Internet as an open platform for consumer choice and end-user innovation. Slater provides an overview of possible legislation to ensure robust access to the open Internet. He also presents two novel ideas that could help transform broadband policy in fundamental ways: customer ownership of last mile fiber, and Measurement Lab, a new research platform for broadband testing tools.
347 Views
18:34:05 04/22/09
Mark Schwabacher: Fault Detection in Rocket Engines
Mark Schwabacher: Fault Detection in Rocket Engines The Association for Computing Machinery - Association for Computing Machinery Using Supervised and Unsupervised Learning to Detect and Isolate Faults in Rocket Engines. We have used two classes of algorithms to automatically detect and isolate faults in rocket engines. The first class of algorithms is known as supervised learning algorithms. Examples of supervised learning algorithms include decision trees and support vector machines. These algorithms require training data consisting of a large number of labeled examples of sensor data from both nominal operation and from failures. They learn a model that can distinguish among nominal data and data from each failure mode, and can thus perform both fault detection and fault isolation. In real rocket engine sensor data, there are not enough failures to allow supervised learning to be used, so we have only been able to use this class of algorithms with simulated data. The second class of algorithms is known as unsupervised anomaly detection algorithms. These algorithms are trained using only nominal data, learn a model of the nominal data, and signal a failure when future data fails to match the model. They are not able to identify the failure mode, but they can be trained using real data that does not include any failures. Examples of unsupervised anomaly detection algorithms include the Inductive Monitoring System (IMS), Orca, GritBot, and one-class support vector machines. We will present results of applying unsupervised anomaly detection to detecting faults in real data from the Space Shuttle Main Engine, and of applying supervised learning to detecting and isolating faults in simulated data from the J-2X rocket engine.
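The unsupervised approach described above (train on nominal data only, then flag data that does not match) can be illustrated with a very small Python sketch. It is deliberately simple and is not IMS, Orca, GritBot, or a one-class SVM: the class name, the nearest-neighbor distance score, and the margin of 1.5 are assumptions chosen for illustration.

```python
import math


class NominalModel:
    """Toy anomaly detector trained only on nominal sensor vectors."""

    def __init__(self, nominal, margin=1.5):
        self.nominal = [tuple(x) for x in nominal]
        if len(self.nominal) < 2:
            raise ValueError("need at least two nominal samples")
        # Distance from each nominal sample to its nearest nominal neighbor.
        nearest = [
            min(math.dist(a, b) for j, b in enumerate(self.nominal) if j != i)
            for i, a in enumerate(self.nominal)
        ]
        # Anything farther than the widest nominal gap (times margin) is suspect.
        self.threshold = margin * max(nearest)

    def score(self, x):
        """Distance from x to the closest nominal sample."""
        return min(math.dist(x, b) for b in self.nominal)

    def is_anomaly(self, x):
        return self.score(x) > self.threshold


# Made-up two-sensor readings from nominal operation.
model = NominalModel([(1.0, 2.0), (1.1, 2.1), (0.9, 1.9), (1.0, 2.1)])
```

As the abstract notes for this class of algorithms, such a model can signal that something is wrong but cannot say which failure mode occurred.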
152 Views
17:02:49 03/26/09
Yuan Yu on DryadLINQ
Yuan Yu on DryadLINQ The Association for Computing Machinery - Association for Computing Machinery Yuan Yu describes the programming model, the design, and the implementation of the DryadLINQ system. DryadLINQ provides a simple, powerful, and elegant programming environment for large-scale data parallel computing. It combines the .NET Language Integrated Query (LINQ) and the Dryad distributed execution engine. A DryadLINQ program is a sequential program (written in C#, VB, or F#) composed of LINQ expressions performing arbitrary side effect-free transformations on datasets, and can be written and debugged using standard .NET development tools. The DryadLINQ system automatically and transparently translates the data-parallel portions of the program into a distributed execution plan and executes it using the Dryad execution engine, which ensures reliable, scalable execution of the plan.
381 Views
17:54:37 03/16/09
Greg Makowski: Event Lift Forecasting
Greg Makowski: Event Lift Forecasting The Association for Computing Machinery - Association for Computing Machinery Embedded Automatic Model Training and Forecasting in an Enterprise Software Application. How can the process of Knowledge Discovery in Databases be automated, competitive and reliable? One approach is to focus on a narrow vertical market application, with known data sources and data feeds. Then you can automate the Exploratory Data Analysis (EDA) and Preprocessing phases. But how do you automate the selection of training data? Can the enterprise application be installed and configured at a variety of clients without a Senior Knowledge Discovery Engineer? How can you minimize "worst case" results of such a system when used by a business user going through their normal business role? How can you deeply investigate and model "business values" (i.e. things that can get an end user promoted or fired) into the core of the data mining algorithms? This talk answers these questions and more. The patent-pending application, ELF, is an enterprise application in the retail supply chain vertical market. Before the development of this system, one enterprise application was used to lay out a weekly newspaper flier three weeks before the sales event, which in turn fed data into a replenishment application. The replenishment application kept products on the store shelves, with a minimal amount of over stock and under stock. The pain point was that the retail buyer would have to manually estimate the sales lift, or the multiplier increase in sales, for every item for every store. While human expertise can be great, it isn't as scalable when applied to a sales event with 1,000 - 4,000 items on sale in 6,000 stores. ELF (Event Lift Forecasting) would import data from a planned event and automatically analyze and forecast the lift for each store-item combination. Data elements used included pricing, placement in the flier, store geography and demographics, seasonality, and product hierarchy. The resulting ELF system produced an 8-30% reduction in over and under stock costs, which is very significant in terms of the low profit margins in the supply chain industry.
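The abstract defines sales lift as the multiplier increase in sales during an event. As a small worked illustration in Python (the function names and the numbers are made up, and this says nothing about how ELF itself models lift), lift is the ratio of event sales to baseline sales, and a forecast lift turns a baseline into an expected event quantity for one store-item pair:

```python
def sales_lift(event_units, baseline_units):
    """Lift as a multiplier: units sold during the event over normal units."""
    if baseline_units <= 0:
        raise ValueError("baseline must be positive")
    return event_units / baseline_units


def forecast_event_units(baseline_units, lift):
    """Expected event sales for one store-item pair, given a forecast lift."""
    return baseline_units * lift


# Made-up example: an item that normally sells 120 units sold 300 on promotion.
observed = sales_lift(300, 120)
planned = forecast_event_units(120, observed)
```

With thousands of items across thousands of stores, this ratio has to be estimated for millions of store-item pairs, which is the scaling problem the abstract describes.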
146 Views
12:16:37 02/26/09
Bill Byrne: Speech UIs for Mobile Applications
Bill Byrne: Speech UIs for Mobile Applications The Association for Computing Machinery - Association for Computing Machinery The growing number of mobile applications with speech is drawing renewed interest in a series of interesting and challenging UI puzzles. In this talk, Googler Bill Byrne gives a brief overview of the various Google projects that involve speech. He also describes his work on the design and research of mobile apps with speech.
179 Views
15:34:03 01/26/09
Logic Paradigm for C++
Logic Paradigm for C++ The Association for Computing Machinery - Association for Computing Machinery The Logic paradigm (LP) is a powerful, Turing-complete programming paradigm that has seen little representation in mainstream languages as compared to the Object-Oriented, Imperative and Functional paradigms. LP is an important approach in Computer Science towards what is sometimes referred to as the Holy Grail of programming: "The user states the problem, the computer solves it". The origins of the theory underlying Logic date back to about 300 B.C. when Aristotle founded Formal Logic to bring rigor to logical inferencing. The theory matured into Modern Logic more recently (early 1900s) when Russell & Whitehead showed that all of Mathematics could be reduced to Logic. This talk will provide an introduction to the basics of LP in C++, followed by plenty of examples to develop a feel for and start thinking in terms of this paradigm. We will also observe how it naturally blends with the other paradigms. And finally, we shall broaden the scope to see how powerful multiparadigm solutions emerge when programmers can freely mix and match paradigms. It will be evident from this talk how a clean and deep integration of LP makes C++ a fountainhead for the many design patterns yet to be discovered - SF Bay Area ACM
885 Views
14:03:39 12/11/08
Francesco Cesarini: Erlang Concurrency, What's the Fuss?
Francesco Cesarini: Erlang Concurrency, What's the Fuss? The Association for Computing Machinery - Association for Computing Machinery Erlang's concurrency model has been used in commercial systems for well over 15 years, but what differentiates it from other technologies? What are the constructs, what makes them so powerful and scalable, and when using them, what change in mindset is required from the developers? What makes Erlang an excellent choice when developing with SMP in mind? This talk, based on 15 years of concurrent functional programming in Erlang, attempts to answer all these questions. With live demos, Erlang expert Francesco Cesarini provides benchmarks on process creation and message passing. He also covers the constructs which provide the concurrency model and the fault tolerance built around it. He gives practical examples of IM and SMS based systems which make the correct use of the concurrency model, provides case studies of systems that work, and ones that don't. The talk concludes with Cesarini's experiences of using Erlang on multi-processor machines, and the challenges this boost in performance is giving developers - SF Bay Area ACM