Sharing resources on Statistical Learning Theory and Machine Learning (including SVMs, semi-supervised learning, ensemble learning, and clustering). Welcome to contact and communicate with me. Email: , QQ: 112035246

Sunday, April 25, 2010

ICML 2010 - Accepted Papers

The following 152 papers have been accepted:

•16: Large Graph Construction for Scalable Semi-supervised Learning
Wei Liu, Junfeng He, Shih-Fu Chang

•23: Boosting Classifiers with Tightened L0-Relaxation Penalties
Noam Goldberg, Jonathan Eckstein

•26: Variable Selection in Model-Based Clustering: To Do or To Facilitate
Leonard Poon, Nevin Zhang, Tao Chen, Yi Wang

•28: Modeling Interaction via the Principle of Maximum Causal Entropy
Brian Ziebart, Drew Bagnell, Anind Dey

•35: Multi-Task Learning of Gaussian Graphical Models
Jean Honorio, Luis Ortiz, Dimitris Samaras

•45: Spherical Topic Models
Joseph Reisinger, Austin Waters, Bryan Silverthorn, Raymond Mooney

•52: Feature Selection Using Regularization in Approximate Linear Programs for Markov Decision Processes
Gavin Taylor, Marek Petrik, Ron Parr, Shlomo Zilberstein

•76: Multi-agent Learning Experiments on Repeated Matrix Games
Bruno Bouzy, Marc Métivier

•77: Probabilistic Backward and Forward Reasoning in Stochastic Relational Worlds
Tobias Lang, Marc Toussaint

•78: Causal filter selection in microarray data
Gianluca Bontempi, Patrick Meyer

•87: A Conditional Random Field for Multi-Instance Learning
Thomas Deselaers, Vittorio Ferrari

•99: Supervised Aggregation of Classifiers using Artificial Prediction Markets
Nathan Lay, Adrian Barbu

•100: 3D Convolutional Neural Networks for Human Action Recognition
Shuiwang Ji, Wei Xu, Ming Yang, Kai Yu

•107: Asymptotic Analysis of Generative Semi-Supervised Learning
Joshua Dillon, Krishnakumar Balasubramanian, Guy Lebanon

•115: Restricted Boltzmann Machines are Hard to Approximately Evaluate or Simulate
Phil Long, Rocco Servedio

•117: Learning from Noisy Side Information by Generalized Maximum Entropy Model
Tianbao Yang, Rong Jin

•119: Finding Planted Partitions in Nearly Linear Time using Arrested Spectral Clustering
Nader Bshouty, Phil Long

•123: The Elastic Embedding Algorithm for Dimensionality Reduction
Miguel Carreira-Perpinan, Jianwu Zeng

•125: Two-Stage Learning Kernel Algorithms
Corinna Cortes, Mehryar Mohri, Afshin Rostamizadeh

•132: Robust Graph Mode Seeking by Graph Shift
Hairong Liu, Shuicheng Yan

•137: Multiscale Wavelets on Trees, Graphs and High Dimensional Data: Theory and Applications to Semi Supervised Learning
Matan Gavish, Boaz Nadler, Ronald Coifman

•149: Deep Supervised T-Distributed Embedding
Renqiang Min, Zineng Yuan, Laurens van der Maaten, Anthony Bonner, Zhaolei Zhang

•168: A Nonparametric Information Theoretic Clustering Algorithm
Lev Faivishevsky, Jacob Goldberger

•170: Gaussian Process Change Point Models
Yunus Saatci, Ryan Turner, Carl Rasmussen

•175: Dynamical Products of Experts for Modeling Financial Time Series
Yutian Chen, Max Welling

•176: The Margin Perceptron with Unlearning
Constantinos Panagiotakopoulos, Petroula Tsampouka

•178: Sequential Projection Learning for Hashing with Compact Codes
Jun Wang, Sanjiv Kumar, Shih-Fu Chang

•179: Generalization Bounds for Learning Kernels
Corinna Cortes, Mehryar Mohri, Afshin Rostamizadeh

•180: Modeling Transfer Learning in Human Categorization with the Hierarchical Dirichlet Process
Kevin Canini, Tom Griffiths

•187: Convergence of Least Squares Temporal Difference Methods Under General Conditions
Huizhen Yu

•191: Classes of Multiagent Q-learning Dynamics with epsilon-greedy Exploration
Michael Wunder, Michael Littman, Monica Babes

•195: Estimation of (near) low-rank matrices with noise and high-dimensional scaling
Sahand Negahban, Martin Wainwright

•196: A Simple Algorithm for Nuclear Norm Regularized Problems
Martin Jaggi, Marek Sulovský

•197: On Sparse Nonparametric Conditional Covariance Selection
Mladen Kolar, Ankur Parikh, Eric Xing

•202: Exploiting Data-Independence for Fast Belief-Propagation
Julian McAuley, Tiberio Caetano

•207: One-sided Support Vector Regression for Multiclass Cost-sensitive Classification
Han-Hsing Tu, Hsuan-Tien Lin

•219: OTL: A Framework of Online Transfer Learning
Peilin Zhao, Steven C.H. Hoi

•223: SVM Classifier Estimation from Group Probabilities
Stefan Rueping

•227: Learning Sparse SVM for Feature Selection on Very High Dimensional Datasets
Mingkui Tan, Li Wang, Ivor Tsang

•233: Total Variation and Cheeger Cuts
Arthur Szlam, Xavier Bresson

•235: Learning Temporal Graphs for Relational Time-Series Analysis
Yan Liu, Alexandru Niculescu-Mizil, Aurelie Lozano, Yong Lu

•238: Online Streaming Feature Selection
Kui Yu, Xindong Wu, Hao Wang

•242: Making Large-Scale Nystrom Approximation Possible
Mu Li, James Kwok, Bao-Liang Lu

•246: Particle Filtered MCMC-MLE with Connections to Contrastive Divergence
Arthur Asuncion, Qiang Liu, Alex Ihler, Padhraic Smyth

•247: Feature Selection as a one-player game
Romaric Gaudel, Michele Sebag

•248: The Translation-invariant Wishart-Dirichlet Process for Clustering Distance Data
Volker Roth, Thomas Fuchs, Julia Vogt, Sandhya Prabhakaran

•259: Online Prediction with Privacy
Jun Sakuma

•263: Fast boosting using adversarial bandits
Róbert Busa-Fekete, Balazs Kegl

•268: Robust Formulations for Handling Uncertainty in Kernel Matrices
Sahely Bhadra, Sourangshu Bhattacharya, Chiranjib Bhattacharyya, Aharon Ben-Tal

•269: Bayesian Multi-Task Reinforcement Learning
Mohammad Ghavamzadeh, Alessandro Lazaric

•275: A New Analysis of Co-Training
Wei Wang, Zhi-Hua Zhou

•279: Clustering processes
Daniil Ryabko

•280: COFFIN: A Computational Framework for Linear SVMs
Soeren Sonnenburg, Vojtech Franc

•284: Multiagent Inductive Learning: an Argumentation-based Approach
Santiago Ontanon, Enric Plaza

•285: Active Risk Estimation
Christoph Sawade, Niels Landwehr, Steffen Bickel, Tobias Scheffer

•286: Heterogeneous Continuous Dynamic Bayesian Networks with Flexible Structure and Inter-Time Segment Information Sharing
Frank Dondelinger, Sophie Lebre, Dirk Husmeier

•295: Temporal Difference Bayesian Model Averaging: A Bayesian Perspective on Adapting Lambda
Carlton Downey, Scott Sanner

•297: Surrogating the surrogate: accelerating Gaussian-process-based global optimization with a mixture cross-entropy algorithm
Rémi Bardenet, Balazs Kegl

•298: Random Spanning Trees and the Prediction of Weighted Graphs
Nicolo Cesa-Bianchi, Claudio Gentile, Fabio Vitale, Giovanni Zappella

•303: Analysis of a Classification-based Policy Iteration Algorithm
Mohammad Ghavamzadeh, Alessandro Lazaric, Remi Munos

•310: Unsupervised Risk Stratification in Clinical Datasets: Identifying Patients at Risk of Rare Outcomes
Zeeshan Syed, Ilan Rubinfeld

•311: Gaussian Covariance and Scalable Variational Inference
Matthias Seeger

•319: Efficient Learning with Partially Observed Attributes
Ohad Shamir, Nicolo Cesa-Bianchi, Shai Shalev-Shwartz

•330: Boosting for Regression Transfer
David Pardoe, Peter Stone

•331: Label Ranking under Ambiguous Supervision for Learning Semantic Correspondences
Antoine Bordes, Nicolas Usunier, Jason Weston

•333: From Transformation-Based Dimensionality Reduction to Feature Selection
Mahdokht Masaeli, Glenn Fung, Jennifer Dy

•336: Least-Squares λ Policy Iteration: Bias-Variance Trade-off in Control Problems
Christophe Thiery, Bruno Scherrer

•342: Multiple Non-Redundant Spectral Clustering Views
Donglin Niu, Jennifer Dy

•344: Large Scale Max-Margin Multi-Label Classification with Prior Knowledge about Densely Correlated Labels
Bharath Hariharan, S.V.N. Vishwanathan, Manik Varma

•347: Fast Neighborhood Subgraph Pairwise Distance Kernel
Fabrizio Costa, Kurt De Grave

•352: Tree-Guided Group Lasso for Multi-Task Regression with Structured Sparsity
Seyoung Kim, Eric Xing

•353: Label Ranking Methods based on the Plackett-Luce Model
Weiwei Cheng, Krzysztof Dembczynski, Eyke Huellermeier

•359: A DC Programming Approach for Sparse Eigenvalue Problem
Mamadou Thiao, Tao Pham Dinh, Hoai An Le Thi

•366: Dictionary Selection for Sparse Representation
Andreas Krause, Volkan Cevher

•370: Deep networks for robust visual recognition
Yichuan Tang, Chris Eliasmith

•371: A Stick-Breaking Construction of the Beta Process
John Paisley, Lawrence Carin

•374: Local Minima Embedding
Minyoung Kim, Fernando De la Torre

•376: Risk minimization, probability elicitation, and cost-sensitive SVMs
Hamed Masnadi-Shirazi, Nuno Vasconcelos

•378: Continuous-Time Belief Propagation
Tal El-Hay, Ido Cohn, Nir Friedman, Raz Kupferman

•384: Measuring Article Influence Without Citations
Sean Gerrish, David Blei

•387: Power Iteration Clustering
Frank Lin, William Cohen

•397: The IBP Compound Dirichlet Process and its Application to Focused Topic Modeling
Sinead Williamson, Chong Wang, Katherine Heller, David Blei

•406: Budgeted Distribution Learning of Belief Net Parameters
Barnabas Poczos, Russell Greiner, Csaba Szepesvari, Liuyang Li

•410: Efficient Selection of Multiple Bandit Arms: Theory and Practice
Shivaram Kalyanakrishnan, Peter Stone

•412: Gaussian Process Multiple Instance Learning
Minyoung Kim, Fernando De la Torre

•416: Proximal Methods for Sparse Hierarchical Dictionary Learning
Rodolphe Jenatton, Julien Mairal, Guillaume Obozinski, Francis Bach

•420: Conditional Topic Random Fields
Jun Zhu, Eric Xing

•421: On the Consistency of Ranking Algorithms
John Duchi, Lester Mackey, Michael Jordan

•422: Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design
Niranjan Srinivas, Andreas Krause, Sham Kakade, Matthias Seeger

•429: Implicit Online Learning
Brian Kulis, Peter Bartlett

•432: Rectified Linear Units Improve Restricted Boltzmann Machines
Vinod Nair, Geoffrey Hinton

•433: Budgeted Learning from Data Streams
Ryan Gomes, Andreas Krause

•436: Interactive Submodular Set Cover
Andrew Guillory, Jeff Bilmes

•438: A fast natural Newton method
Nicolas Le Roux, Andrew Fitzgibbon

•441: Learning Deep Boltzmann Machines using Adaptive MCMC
Ruslan Salakhutdinov

•442: Internal Rewards Mitigate Agent Boundedness
Jonathan Sorg, Satinder Singh, Richard Lewis

•446: Learning optimally diverse rankings over large document collections
Aleksandrs Slivkins, Filip Radlinski, Sreenivas Gollapudi

•449: Learning Fast Approximations of Sparse Coding
Karol Gregor, Yann LeCun

•451: Boosted Backpropagation Learning for Training Deep Modular Networks
Alexander Grubb, Drew Bagnell

•453: Convergence, Targeted Optimality, and Safety in Multiagent Learning
Doran Chakraborty, Peter Stone

•454: Improved Local Coordinate Coding using Local Tangents
Kai Yu, Tong Zhang

•458: Deep learning via Hessian-free optimization
James Martens

•464: Efficient Reinforcement Learning with Multiple Reward Functions for Randomized Controlled Trial Analysis
Daniel Lizotte, Michael Bowling, Susan Murphy

•468: Cognitive Models of Test-Item Effects in Human Category Learning
Xiaojin Zhu, Bryan Gibson, Kwang-Sung Jun, Tim Rogers

•473: Online Learning for Group Lasso
Haiqin Yang, Zenglin Xu, Irwin King, Michael Lyu

•475: Generalizing Apprenticeship Learning across Hypothesis Classes
Thomas Walsh, Kaushik Subramanian, Michael Littman, Carlos Diuk

•481: Projection Penalties: Dimension Reduction without Loss
Yi Zhang, Jeff Schneider

•493: Application of Machine Learning To Epileptic Seizure Detection
Ali Shoeb

•495: Hilbert Space Embeddings of Hidden Markov Models
Le Song, Byron Boots, Sajid Saddiqi, Geoffrey Gordon, Alex Smola

•502: Learning Markov Logic Networks Using Structural Motifs
Stanley Kok, Pedro Domingos

•504: Metric Learning to Rank
Brian McFee, Gert Lanckriet

•505: Collective Link Prediction in Multiple Heterogenous Domains
Bin Cao, Nathan Liu, Qiang Yang

•518: On Non-identifiability of Bayesian Matrix Factorization Models
Shinichi Nakajima, Masashi Sugiyama

•520: On learning with kernels for unordered pairs
Martial Hue, Jean-Philippe Vert

•521: Robust Subspace Segmentation by Low-Rank Representation
Guangcan Liu, Zhouchen Lin, Yong Yu

•522: Structured Output Learning with Indirect Supervision
Ming-Wei Chang, Vivek Srikumar, Dan Goldwasser, Dan Roth

•523: Bayesian Nonparametric Matrix Factorization for Recorded Music
Matthew Hoffman, David Blei, Perry Cook

•532: Learning the Linear Dynamical System with ASOS
James Martens

•537: Bottom-Up Learning of Markov Network Structure
Jesse Davis, Pedro Domingos

•540: Simple and Efficient Multiple Kernel Learning By Group Lasso
Zenglin Xu, Rong Jin, Haiqin Yang, Irwin King, Michael Lyu

•544: Active Learning for Networked Data
Mustafa Bilgic, Lilyana Mihalkova, Lise Getoor

•546: Model-based reinforcement learning with nearly tight exploration complexity bounds
Istvan Szita, Csaba Szepesvari

•549: Forgetting Counts: Constant Memory Inference for a Dependent Hierarchical Pitman-Yor Process
Nicholas Bartlett, David Pfau, Frank Wood

•551: Distance Dependent Chinese Restaurant Processes
David Blei, Peter Frazier

•553: Mixed Membership Matrix Factorization
Lester Mackey, David Weiss, Michael Jordan

•554: An Analysis of the Convergence of Graph Laplacians
Daniel Ting

•556: An Efficient and General Augmented Lagrangian Algorithm for Learning Low-Rank Matrices
Ryota Tomioka, Taiji Suzuki, Masashi Sugiyama, Hisashi Kashima

•562: A scalable trust-region algorithm with application to mixed-norm regression
Dongmin Kim, Suvrit Sra, Inderjit Dhillon

•568: Learning Programs: A Hierarchical Bayesian Approach
Percy Liang, Michael Jordan, Dan Klein

•569: Multi-Class Pegasos on a Budget
Zhuang Wang, Koby Crammer, Slobodan Vucetic

•571: Inverse Optimal Control with Linearly Solvable MDPs
Krishnamurthy Dvijotham, Emanuel Todorov

•576: Telling cause from effect based on high-dimensional observations
Dominik Janzing, Patrik Hoyer, Bernhard Schoelkopf

•582: Mining Clustering Dimensions
Sajib Dasgupta, Vincent Ng

•586: Learning Tree Conditional Random Fields
Joseph Bradley, Carlos Guestrin

•587: Learning efficiently with approximate inference via dual losses
Ofer Meshi, David Sontag, Tommi Jaakkola, Amir Globerson

•588: Approximate Predictive Representations of Partially Observable Systems
Doina Precup, Monica Dinculescu

•589: Bayes Optimal Multilabel Classification via Probabilistic Classifier Chains
Krzysztof Dembczynski, Weiwei Cheng, Eyke Huellermeier

•592: Non-Local Contrastive Objectives
David Vickrey, Cliff Lin, Daphne Koller

•593: Constructing States for Reinforcement Learning
M. M. Mahmud

•596: Graded Multilabel Classification: The Ordinal Case
Weiwei Cheng, Krzysztof Dembczynski, Eyke Huellermeier

•598: Finite-Sample Analysis of LSTD
Mohammad Ghavamzadeh, Alessandro Lazaric, Remi Munos

•601: On the Interaction between Norm and Dimensionality: Multiple Regimes in Learning
Percy Liang, Nathan Srebro

•605: Learning Hierarchical Riffle Independent Groupings from Rankings
Jonathan Huang, Carlos Guestrin

•620: Active Learning for Multi-Task Adaptive Filtering
Abhay Harpale, Yiming Yang

•627: Toward Off-Policy Learning Control with Function Approximation
Hamid Maei, Csaba Szepesvari, Shalabh Bhatnagar, Richard Sutton

•628: Fast and smooth: Accelerated dual decomposition for MAP inference
Vladimir Jojic, Stephen Gould, Daphne Koller

•636: Sparse Gaussian Process Regression via L1 Penalization
Feng Yan, Yuan Qi

•638: A theoretical analysis of feature pooling in vision algorithms
Y-Lan Boureau, Jean Ponce, Yann LeCun

•642: Comparing Clusterings in Space
Michael Coen, Hidayath Ansari, Nathanael Fillmore

•643: Discriminative Semi-Supervised Learning by Encouraging Generative Models to Discover Relevant Latent Representations
Gregory Druck, Andrew McCallum

•652: Nonparametric Return Density Estimation for Reinforcement Learning
Tetsuro Morimura, Masashi Sugiyama, Hisashi Kashima, Hirotaka Hachiya, Toshiyuki Tanaka

•654: Should one compute the Temporal Difference fix point or minimize the Bellman Residual?
Bruno Scherrer

Sunday, February 28, 2010

South University of Science and Technology of China


The South University of Science and Technology of China (SUSTC), now under construction, is located in Shenzhen, the first Special Economic Zone of China's reform and opening-up. Shenzhen is China's window to the outside world, a bridge linking Hong Kong with the mainland, and a modern city that has risen rapidly over thirty years of reform. Against the backdrop of reform and development in Chinese higher education, SUSTC is a new university that the Shenzhen Municipal Government is founding with new thinking and new mechanisms, in implementation of the Outline of the Plan for the Reform and Development of the Pearl River Delta (2008-2020). The university will focus on science and engineering, complemented by management and selected humanities programs. Drawing on the successful experience of universities at home and abroad, it will build every program, department, school, and affiliated research institute to a first-class Asian standard, becoming a small-scale, high-quality research university.

In the new era the state has clearly positioned Shenzhen as a national pilot zone for comprehensive reform, a national economic center, an innovation-oriented city, and an international city. Shenzhen will continue its important mission of exploring models of scientific development and pioneering deeper reform. After Shenzhen's major contributions to economic reform over thirty years of opening-up, SUSTC is the city's innovative attempt to explore, on behalf of the country, a new path toward building a world-class university, to pilot a new model of cultivating innovative talent with Chinese characteristics, and, by building a first-class university, to drive regional economic development and contribute to the rejuvenation of the Chinese nation.

SUSTC's operating funds are allocated by the Shenzhen municipal government. A SUSTC foundation is also planned, to raise funds broadly from society and private sources, gradually forming a pattern in which government funding remains primary while funding sources become diversified.

The core of the university emblem is a torch, symbolizing SUSTC's mission: to blaze a new trail for the reform of higher education. The emblem's background is a gradient azure that sets off the torch's glow; this azure is taken from the glaze of Ru-kiln porcelain, a color prized in traditional Chinese aesthetics.

The calligraphy of the six characters of the university's Chinese name, "南方科技大学", is drawn from the works of the great historical calligraphers Chu Suiliang, Liu Gongquan, and Yan Zhenqing; this distinctive combination of their writing represents the university's spirit of "drawing on the strengths of many".

Friday, February 26, 2010



Friday, July 24, 2009

Science, Philosophy and Religion (Einstein)

It would not be difficult to come to an agreement as to what we understand by science. Science is the century-old endeavor to bring together by means of systematic thought the perceptible phenomena of this world into as thoroughgoing an association as possible. To put it boldly, it is the attempt at the posterior reconstruction of existence by the process of conceptualization. But when asking myself what religion is I cannot think of the answer so easily. And even after finding an answer which may satisfy me at this particular moment, I still remain convinced that I can never under any circumstances bring together, even to a slight extent, the thoughts of all those who have given this question serious consideration.

At first, then, instead of asking what religion is I should prefer to ask what characterizes the aspirations of a person who gives me the impression of being religious: a person who is religiously enlightened appears to me to be one who has, to the best of his ability, liberated himself from the fetters of his selfish desires and is preoccupied with thoughts, feelings, and aspirations to which he clings because of their superpersonal value. It seems to me that what is important is the force of this superpersonal content and the depth of the conviction concerning its overpowering meaningfulness, regardless of whether any attempt is made to unite this content with a divine Being, for otherwise it would not be possible to count Buddha and Spinoza as religious personalities. Accordingly, a religious person is devout in the sense that he has no doubt of the significance and loftiness of those superpersonal objects and goals which neither require nor are capable of rational foundation. They exist with the same necessity and matter-of-factness as he himself. In this sense religion is the age-old endeavor of mankind to become clearly and completely conscious of these values and goals and constantly to strengthen and extend their effect. If one conceives of religion and science according to these definitions then a conflict between them appears impossible. For science can only ascertain what is, but not what should be, and outside of its domain value judgments of all kinds remain necessary. Religion, on the other hand, deals only with evaluations of human thought and action: it cannot justifiably speak of facts and relationships between facts. According to this interpretation the well-known conflicts between religion and science in the past must all be ascribed to a misapprehension of the situation which has been described.

For example, a conflict arises when a religious community insists on the absolute truthfulness of all statements recorded in the Bible. This means an intervention on the part of religion into the sphere of science; this is where the struggle of the Church against the doctrines of Galileo and Darwin belongs. On the other hand, representatives of science have often made an attempt to arrive at fundamental judgments with respect to values and ends on the basis of scientific method, and in this way have set themselves in opposition to religion. These conflicts have all sprung from fatal errors.

Now, even though the realms of religion and science in themselves are clearly marked off from each other, nevertheless there exist between the two strong reciprocal relationships and dependencies. Though religion may be that which determines the goal, it has, nevertheless, learned from science, in the broadest sense, what means will contribute to the attainment of the goals it has set up. But science can only be created by those who are thoroughly imbued with the aspiration toward truth and understanding. This source of feeling, however, springs from the sphere of religion. To this there also belongs the faith in the possibility that the regulations valid for the world of existence are rational, that is, comprehensible to reason. I cannot conceive of a genuine scientist without that profound faith. The situation may be expressed by an image: science without religion is lame, religion without science is blind.

Though I have asserted above that in truth a legitimate conflict between religion and science cannot exist, I must nevertheless qualify this assertion once again on an essential point, with reference to the actual content of historical religions. This qualification has to do with the concept of God. During the youthful period of mankind's spiritual evolution human fantasy created gods in man's own image, who, by the operations of their will were supposed to determine, or at any rate to influence, the phenomenal world. Man sought to alter the disposition of these gods in his own favor by means of magic and prayer. The idea of God in the religions taught at present is a sublimation of that old concept of the gods. Its anthropomorphic character is shown, for instance, by the fact that men appeal to the Divine Being in prayers and plead for the fulfillment of their wishes.

Nobody, certainly, will deny that the idea of the existence of an omnipotent, just, and omnibeneficent personal God is able to accord man solace, help, and guidance; also, by virtue of its simplicity it is accessible to the most undeveloped mind. But, on the other hand, there are decisive weaknesses attached to this idea in itself, which have been painfully felt since the beginning of history. That is, if this being is omnipotent, then every occurrence, including every human action, every human thought, and every human feeling and aspiration is also His work; how is it possible to think of holding men responsible for their deeds and thoughts before such an almighty Being? In giving out punishment and rewards He would to a certain extent be passing judgment on Himself. How can this be combined with the goodness and righteousness ascribed to Him?

The main source of the present-day conflicts between the spheres of religion and of science lies in this concept of a personal God. It is the aim of science to establish general rules which determine the reciprocal connection of objects and events in time and space. For these rules, or laws of nature, absolutely general validity is required--not proven. It is mainly a program, and faith in the possibility of its accomplishment in principle is only founded on partial successes. But hardly anyone could be found who would deny these partial successes and ascribe them to human self-deception. The fact that on the basis of such laws we are able to predict the temporal behavior of phenomena in certain domains with great precision and certainty is deeply embedded in the consciousness of the modern man, even though he may have grasped very little of the contents of those laws. He need only consider that planetary courses within the solar system may be calculated in advance with great exactitude on the basis of a limited number of simple laws. In a similar way, though not with the same precision, it is possible to calculate in advance the mode of operation of an electric motor, a transmission system, or of a wireless apparatus, even when dealing with a novel development.

To be sure, when the number of factors coming into play in a phenomenological complex is too large, scientific method in most cases fails us. One need only think of the weather, in which case prediction even for a few days ahead is impossible. Nevertheless no one doubts that we are confronted with a causal connection whose causal components are in the main known to us. Occurrences in this domain are beyond the reach of exact prediction because of the variety of factors in operation, not because of any lack of order in nature.

We have penetrated far less deeply into the regularities obtaining within the realm of living things, but deeply enough nevertheless to sense at least the rule of fixed necessity. One need only think of the systematic order in heredity, and in the effect of poisons, as for instance alcohol, on the behavior of organic beings. What is still lacking here is a grasp of connections of profound generality, but not a knowledge of order in itself.

The more a man is imbued with the ordered regularity of all events the firmer becomes his conviction that there is no room left by the side of this ordered regularity for causes of a different nature. For him neither the rule of human nor the rule of divine will exists as an independent cause of natural events. To be sure, the doctrine of a personal God interfering with natural events could never be refuted, in the real sense, by science, for this doctrine can always take refuge in those domains in which scientific knowledge has not yet been able to set foot.

But I am persuaded that such behavior on the part of the representatives of religion would not only be unworthy but also fatal. For a doctrine which is able to maintain itself not in clear light but only in the dark, will of necessity lose its effect on mankind, with incalculable harm to human progress. In their struggle for the ethical good, teachers of religion must have the stature to give up the doctrine of a personal God, that is, give up that source of fear and hope which in the past placed such vast power in the hands of priests. In their labors they will have to avail themselves of those forces which are capable of cultivating the Good, the True, and the Beautiful in humanity itself. This is, to be sure, a more difficult but an incomparably more worthy task. (This thought is convincingly presented in Herbert Samuel's book, Belief and Action.) After religious teachers accomplish the refining process indicated they will surely recognize with joy that true religion has been ennobled and made more profound by scientific knowledge.

If it is one of the goals of religion to liberate mankind as far as possible from the bondage of egocentric cravings, desires, and fears, scientific reasoning can aid religion in yet another sense. Although it is true that it is the goal of science to discover rules which permit the association and foretelling of facts, this is not its only aim. It also seeks to reduce the connections discovered to the smallest possible number of mutually independent conceptual elements. It is in this striving after the rational unification of the manifold that it encounters its greatest successes, even though it is precisely this attempt which causes it to run the greatest risk of falling a prey to illusions. But whoever has undergone the intense experience of successful advances made in this domain is moved by profound reverence for the rationality made manifest in existence. By way of the understanding he achieves a far-reaching emancipation from the shackles of personal hopes and desires, and thereby attains that humble attitude of mind toward the grandeur of reason incarnate in existence, and which, in its profoundest depths, is inaccessible to man. This attitude, however, appears to me to be religious, in the highest sense of the word. And so it seems to me that science not only purifies the religious impulse of the dross of its anthropomorphism but also contributes to a religious spiritualization of our understanding of life.

The further the spiritual evolution of mankind advances, the more certain it seems to me that the path to genuine religiosity does not lie through the fear of life, and the fear of death, and blind faith, but through striving after rational knowledge. In this sense I believe that the priest must become a teacher if he wishes to do justice to his lofty educational mission.
(published by the Conference on Science, Philosophy and Religion in Their Relation to the Democratic Way of Life, Inc., New York, 1941.)

Tuesday, July 14, 2009


x1 = [1 2 3 4; 4 NaN 6 5; NaN 2 3 NaN]
[y1,ps] = fixunknowns(x1)
% ps is the returned settings structure, e.g.:
%        name: 'fixunknowns'
%       xrows: 3
%       yrows: 5
%     unknown: [2 3]
%       known: 1
%       shift: [0 0 1]
%      xmeans: [3x1 double]


function y = ReplaceMissingValue(x, type, value)
% x: n*d; n denotes the number of samples, d denotes the dimension
% type: 1 -> replace each NaN with the given value;
%       2 -> replace each NaN with its column mean (value is optional);
%       when type is omitted, it defaults to 2
% value: the number used to replace NaN when type == 1

if nargin < 2
    type = 2;
end
if nargin < 3 && type == 1
    error('Please assign a "value" to use as the replacement');
end

if type == 2
    for j = 1:size(x, 2)
        col = x(:, j);
        col(isnan(col)) = mean(col(~isnan(col)));
        x(:, j) = col;
    end
elseif type == 1
    x(isnan(x)) = value;
end
y = x;

Weka: under filter -> unsupervised -> attribute -> ReplaceMissingValues, only one method of handling missing values is offered, namely modes and means.

SPSS is far more powerful here: it offers several methods, under Transform -> Replace Missing Values.

MATLAB and Excel can also solve the missing-value problem programmatically or by other means, but they are comparatively less convenient.
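The column-mean and constant-value imputation these tools offer can be sketched in a few lines of NumPy; `replace_missing` below is a hypothetical helper, not code taken from Weka, SPSS, MATLAB, or Excel:

```python
import numpy as np

def replace_missing(x, strategy="mean", value=0.0):
    """Fill NaNs in an n*d sample matrix.

    strategy: "mean"  -> replace each NaN with its column mean
              "value" -> replace every NaN with the constant `value`
    """
    x = np.array(x, dtype=float)          # work on a copy
    if strategy == "value":
        x[np.isnan(x)] = value
    else:
        for j in range(x.shape[1]):
            col = x[:, j]                 # a view into x
            col[np.isnan(col)] = np.nanmean(col)
    return x

data = np.array([[1.0, 2.0, 3.0],
                 [4.0, np.nan, 6.0],
                 [np.nan, 2.0, 3.0]])
print(replace_missing(data))              # NaNs become the column means 2.5 and 2.0
```

For production code, scikit-learn's SimpleImputer implements the same mean strategy.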

There are many normalization methods. One of the most common is the z-score, i.e. mean = 0, std = 1; both SPSS and MATLAB provide it.

The MATLAB function is mapstd(data, means, std); see the MATLAB documentation for details (note: data is d*n, where d is the dimension and n is the number of samples).

Another method is to normalize to an interval such as [a, b]. Such methods are simple and mostly based on the formula y = (ymax - ymin)*(x - xmin)/(xmax - xmin) + ymin; MATLAB provides the function mapminmax(data); see the MATLAB documentation for details (note: data is d*n, where d is the dimension and n is the number of samples).
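Both transforms are easy to sketch in NumPy; `zscore` and `rescale` below are hypothetical helpers that mirror mapstd and mapminmax column-wise (note the MATLAB functions expect d*n data, while this sketch uses n*d):

```python
import numpy as np

def zscore(x):
    """Column-wise z-score: each feature gets mean 0 and std 1."""
    return (x - x.mean(axis=0)) / x.std(axis=0)

def rescale(x, a=0.0, b=1.0):
    """Column-wise linear map to [a, b]:
    y = (b - a) * (x - xmin) / (xmax - xmin) + a."""
    xmin = x.min(axis=0)
    xmax = x.max(axis=0)
    return (b - a) * (x - xmin) / (xmax - xmin) + a

data = np.array([[1.0, 10.0],
                 [2.0, 20.0],
                 [3.0, 30.0]])
print(zscore(data))
print(rescale(data, -1.0, 1.0))   # each column maps to -1, 0, 1
```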


A simple approach: for continuous variables, normalize the data with the z-score and treat samples containing cells with values greater than 3 or less than -3 as noise. For categorical variables, since every value carries a specific meaning, any value outside the valid range can be regarded as noise. Noise points can either be removed or treated as missing values. SPSS and Excel can provide this kind of simple processing.
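The continuous-variable rule can be sketched with a hypothetical `flag_noisy_samples` helper (with the population standard deviation, a cell's |z| can only exceed 3 once the column has at least 11 values):

```python
import numpy as np

def flag_noisy_samples(x, thresh=3.0):
    """Mark a sample (row) as noise when any of its z-scored
    cells falls outside [-thresh, +thresh]."""
    z = (x - x.mean(axis=0)) / x.std(axis=0)
    return (np.abs(z) > thresh).any(axis=1)

# Ten identical values plus one extreme value: only the outlier is flagged.
col = np.array([1.0] * 10 + [100.0]).reshape(-1, 1)
print(flag_noisy_samples(col))
```

Flagged samples can then be dropped, or their offending cells set to NaN and handled with the missing-value methods discussed earlier.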

Monday, July 13, 2009

Visiting my Blogspot through a web proxy

My blog was blocked, so I stopped maintaining it, though it stayed online. Recently I found a fairly good web proxy site that can actually reach the long-blocked Blogger, and the speed is acceptable. Exciting. The address is shown as follows:

Thursday, March 12, 2009

People in Multiple Kernel Learning

1.Gert Lanckriet----- Homepage

My research interests are on the interplay between machine learning, applied statistics and convex optimization techniques. I am interested in developing methods for pattern discovery from extremely large-scale data sets, taking uncertainty into account and respecting computational constraints. More precisely, my research focuses on the integration of multiple, heterogeneous data types for a variety of pattern discovery tasks where the amount of data is extremely large and the solutions desired to be sparse. An important challenge in the field of machine learning is to deal with the increasing amount of data that is available for learning and to leverage the (also increasing) diversity of information sources, describing these data. Beyond classical vectorial data formats, data in the format of graphs, trees, strings and beyond have become widely available for data mining, e.g., the linked structure of communication networks, amino acid sequences describing proteins, etc. Moreover, for interpretability, stability and economical reasons, decision rules that rely on a small subset of the information sources and/or a small subset of the features describing the data are highly desired: sparse learning algorithms are a must. My research is inspired by practical applications in computational genomics, financial engineering and computer music.

Lanckriet, G.R.G., Cristianini, N., Bartlett, P., El Ghaoui, L., Jordan, M.I. (2004). Learning the Kernel Matrix with Semidefinite Programming. Journal of Machine Learning Research, 5, 27-72.

2.Francis Bach-----homepage

I am a researcher at INRIA, working in the Willow project, which is located at Ecole Normale Superieure. I completed my Ph.D. in Computer Science at U.C. Berkeley, working with Professor Michael Jordan, and spent two years in the Mathematical Morphology group at Ecole des Mines de Paris. I am interested in statistical machine learning, and especially in graphical models, sparse methods, kernel-based learning, vision and signal processing. [CV (English)] [CV (French)]

F. Bach, G. R. G. Lanckriet, M. I. Jordan. Multiple Kernel Learning, Conic Duality, and the SMO Algorithm. Proceedings of the Twenty-first International Conference on Machine Learning, 2004. [pdf] [tech-report]

3.Alain Rakotomamonjy----Homepage

Research Interests
Kernel methods and support vector machine algorithms; regularization paths; multiple kernel learning; kernel design; sparsity and variable selection; wavelet and time-frequency signal analysis; signal classification; brain-computer interfaces; object recognition.

A. Rakotomamonjy, F. Bach, Y. Grandvalet, S. Canu, SimpleMKL,  Journal of Machine Learning Research, Vol. 9, pp 2491-2521, 2008. [JMLR page][PDF] [code]

4.M. Gönen----Homepage

Research Interests
Support vector machines; kernel methods; simulation and real-time control of flexible manufacturing systems.

M. Gönen and E. Alpaydın (2008). "Localized Multiple Kernel Learning". In Proceedings of the 25th International Conference on Machine Learning, 352-359.

5.S. Sonnenburg----Homepage

I am currently a postdoc in the Machine Learning in Biology Group at the Friedrich Miescher Laboratory of the Max Planck Society in Tübingen. Before that, I worked in the IDA group at the Fraunhofer Institute FIRST.

I am intrigued by sequence based machine learning methods involving large data sets and have developed several machine learning methods for bioinformatics applications such as splice site recognition, promoter detection and gene finding. I also worked on microarray analysis and motif discovery.


S. Sonnenburg, G. Rätsch, C. Schäfer, and B. Schölkopf. Large scale multiple kernel learning. Journal of Machine Learning Research, 7, 2006.