## Laplacian Eigenmap: R code and results

$$L f = \lambda D f \quad \mathrm{s.t.} \quad f^T D f = 1 \quad \mathrm{and} \quad f^T D \mathbf{1} = 0$$

【1】Mikhail Belkin and Partha Niyogi, “Laplacian Eigenmaps for Dimensionality Reduction and Data Representation,” Neural Computation 15, no. 6 (2003): 1373-1396.

【2】漫谈 Clustering (番外篇): Dimensionality Reduction http://blog.pluskid.org/?p=290

## Kernel PCA: principles and demo

1. Matrix foundations of Kernel Principal Component Analysis

1.1 How is conventional PCA done?

$$C=\frac{1}{N} \sum_i x_i x_i^T = \frac{1}{N} X X^T$$

$$CU = U \Lambda \Rightarrow C = U \Lambda U^T = \sum_a \lambda_a u_a u_a^T$$

1.2 How should PCA be done in a high-dimensional feature space?

$$\bar{C}=\frac{1}{N} \sum_i \Phi(x_i) \Phi(x_i)^T = \frac{1}{N} \Phi(X) \Phi(X)^T$$

1.3 How to do PCA in the high-dimensional space with the kernel trick?

$$C u_a = \lambda_a u_a$$
\begin{align} u_a &= \frac{1}{\lambda_a} C u_a \\ &= \frac{1}{\lambda_a} \left( \frac{1}{N} \sum_i x_i x_i^T \right) u_a \\ &= \frac{1}{N \lambda_a} \sum_i x_i (x_i^T u_a) \\ &= \frac{1}{N \lambda_a} \sum_i (x_i^T u_a) x_i \\ &= \sum_i \frac{x_i^T u_a}{N \lambda_a} x_i \\ &= \sum_i \alpha_i^a x_i \end{align}

\begin{align} x_i^T C u_a &= \lambda_a x_i^T u_a \\ x_i^T \frac{1}{N} \sum_j x_j x_j^T \sum_k \alpha_k^a x_k &= \lambda_a x_i^T \sum_k \alpha_k^a x_k \\ \sum_j \sum_k \alpha_k^a (x_i^T x_j) (x_j^T x_k) &= N \lambda_a \sum_k \alpha_k^a (x_i^T x_k) \end{align}

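(Here $$\alpha^a$$ is defined as $$[\alpha_1^a, \alpha_2^a, \ldots, \alpha_N^a]^T$$. With the kernel matrix $$K_{ij} = x_i^T x_j$$, the last line above reads $$K^2 \alpha^a = N \lambda_a K \alpha^a$$, which holds whenever the following eigenproblem does.)

$$K \alpha^a = \tilde{\lambda}_a \alpha^a \quad \mathrm{with} \quad \tilde{\lambda}_a = N \lambda_a$$
The matrix $$K$$ has eigenvalues $$\tilde{\lambda}_a$$ and eigenvectors $$\alpha^a$$, and $$u_a$$ can be computed from $$\alpha^a$$. Requiring $$u_a$$ to have unit length fixes the norm of $$\alpha^a$$: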

\begin{align} 1 &= u_a^T u_a \\ &= ( \sum_i \alpha_i^a x_i) ^T (\sum_j \alpha_j^a x_j ) \\ &= \sum_i \sum_j \alpha_i^a \alpha_j^a x_i^T x_j \\ &=(\alpha^a)^T K \alpha^a \\ &=(\alpha^a)^T ( N \lambda_a \alpha^a) \\ &=N \lambda_a (\alpha^a)^T \alpha^a \\ &\Rightarrow \quad \lVert \alpha^a \rVert = 1/\sqrt{N \lambda_a} = 1/\sqrt{\tilde{\lambda}_a} \end{align}

1.4 How to project onto the principal directions?

$$u_a^T t = \sum_i \alpha_i^a x_i^T t = \sum_i \alpha_i^a ( x_i^T t)$$

$$u_a^T t = \sum_i \alpha_i^a K(x_i, t)$$

1.5 How to center the data in the high-dimensional space?

\begin{align} K_{ij}^C &= \langle \Phi_i^C, \Phi_j^C \rangle \\ &= (\Phi_i - \frac{1}{N}\sum_k \Phi_k)^T (\Phi_j - \frac{1}{N}\sum_l \Phi_l ) \\ &=\Phi_i^T\Phi_j - \frac{1}{N}\sum_l \Phi_i^T \Phi_l - \frac{1}{N}\sum_k \Phi_k^T \Phi_j + \frac{1}{N^2}\sum_k \sum_l \Phi_k^T \Phi_l \\ &=K_{ij} - \frac{1}{N}\sum_l K_{il} - \frac{1}{N}\sum_k K_{kj} + \frac{1}{N^2}\sum_k \sum_l K_{kl} \end{align}

$$K^C = K - 1_N K - K 1_N + 1_N K 1_N$$

Here $$1_N$$ denotes the $$N \times N$$ matrix whose entries all equal $$1/N$$.

\begin{align} K(x_i, t)^C &= \langle \Phi_i^C, \Phi_t^C \rangle \\ &= (\Phi_i - \frac{1}{N}\sum_k \Phi_k)^T (\Phi_t - \frac{1}{N}\sum_l \Phi_l ) \\ &=\Phi_i^T\Phi_t - \frac{1}{N}\sum_l \Phi_i^T \Phi_l - \frac{1}{N}\sum_k \Phi_k^T \Phi_t + \frac{1}{N^2}\sum_k \sum_l \Phi_k^T \Phi_l \\ &=K(x_i,t) - \frac{1}{N}\sum_l K_{il} - \frac{1}{N}\sum_k K(x_k,t) + \frac{1}{N^2}\sum_k \sum_l K_{kl} \end{align}

2. Demo (R code)

KPCA figure:

R source code: link to the complete code, KernelPCA

Excerpt of the Kernel PCA code:

# Kernel PCA with a degree-2 polynomial kernel
# k(x,y) = (t(x) %*% y + 1)^2
# X (data matrix, one 2-D point per row), N_total and label come from the complete code.
k1 = function (x,y) { (x[1] * y[1] + x[2] * y[2] + 1)^2 }
K = matrix(0, ncol = N_total, nrow = N_total)
for (i in 1:N_total) {
  for (j in 1:N_total) {
    K[i,j] = k1(X[i,], X[j,])
  }
}
# Center the kernel matrix; ones is the N x N matrix with all entries 1/N
ones = 1/N_total * matrix(1, N_total, N_total)
K_norm = K - ones %*% K - K %*% ones + ones %*% K %*% ones
res = eigen(K_norm)

V = res$vectors
D = diag(res$values)

# Rescale each eigenvector to norm 1/sqrt(eigenvalue); stop at the numerical rank
rank = 0
for (i in 1:N_total) {
  if (D[i,i] < 1e-6) { break }
  V[,i] = V[,i] / sqrt(D[i,i])
  rank = rank + 1
}
# Project the training data onto the retained components
Y = K_norm %*% V[,1:rank]
plot(Y[,1], Y[,2], col = rainbow(3)[label], main = "Kernel PCA (Poly)",
     xlab = "First component", ylab = "Second component")


3. Main references

【1】Jonathon Shlens, A Tutorial on Principal Component Analysis, Shlens03

【2】Wikipedia: http://en.wikipedia.org/wiki/Kernel_principal_component_analysis

【3】Original KPCA paper: Bernhard Schölkopf, Alexander Smola and Klaus-Robert Müller, Kernel principal component analysis, http://www.springerlink.com/content/w0t1756772h41872/fulltext.pdf

【4】Max Welling's class notes for machine learning, Kernel Principal Component Analysis, http://www.ics.uci.edu/~welling/classnotes/papers_class/Kernel-PCA.pdf

## Using a VPN on Ubuntu

1. Create a profile

2. Connect to the VPN
sudo pon <TUNNEL>

3. Change the route table
sudo route add default dev ppp0

4. Change the DNS servers
sudo vi /etc/resolv.conf

nameserver 208.67.222.222
nameserver 208.67.220.220

5. Check

## Notes from using OpenMP

(2) Section 2.2: the _OPENMP macro is defined in the form yyyymm and indicates the version of the OpenMP API that the implementation supports.

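(4) Worksharing constructs come in 4 kinds: the loop construct, the sections construct, the single construct, and the workshare construct (Fortran only).

loop: in C, the directive is immediately followed by a for loop.

sections: similar to loop, but a for loop is not required; each section only has to be a structured block.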

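(5) Section 2.6 covers combining the parallel construct with a worksharing construct, i.e. the two can be written as a single directive. Three subsections then introduce the parallel loop construct (equivalent to a parallel construct that contains only a loop construct), the parallel sections construct (a parallel construct that contains only a sections construct), and the parallel workshare construct (a parallel construct that contains only a workshare construct; Fortran only).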

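Data-sharing attributes of variables in a construct: predetermined (private: variables declared inside the construct and the loop variable of a for construct, while variables declared threadprivate are threadprivate; shared: variables on the heap and static variables), explicitly determined (specified by clauses on the construct), and implicitly determined (set by the default clause if there is one; without a default clause the rules are more involved, for example variables are shared in a parallel construct; the full rules are on page 79). Additional cases that cannot be derived from the implicit rules above are on page 92. (In my opinion, if the data-sharing attributes are so complicated that they are hard to see, the program itself is probably not written clearly enough!)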

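(9) Chapter 3 covers the runtime library routines.

3.2 Execution environment routines, including setting/getting the number of threads, getting the maximum number of threads available, setting the upper limit on the number of threads, and so on.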

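3.3 Lock routines: functions for locking between threads, in two kinds, simple locks and nestable locks (the difference is that a nestable lock can be set multiple times by its owner).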

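3.4 Timing routines. There are only two: omp_get_wtime() returns the elapsed wall clock time in seconds as a double, and omp_get_wtick() returns the timer resolution, i.e. the number of seconds between successive clock ticks.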

(11) Chapter 5 has a wide variety of example programs, so when a concept is unclear we can quickly look it up, for example how to use a lock or how to use reduction.

【1】OpenMP Specification Version 3.0 Complete Specifications – (May, 2008). (PDF)

【2】OpenMP C/C++ Summary Card http://www.openmp.org/mp-documents/OpenMP3.0-SummarySpec.pdf

【3】Wikipedia (the diagram introducing the OpenMP language architecture is very good) http://en.wikipedia.org/wiki/OpenMP

## How to set up BuyVM with LAMP, WordPress and VPN

1. Installing and tuning LAMP

StartServers          1
MinSpareServers       1
MaxSpareServers       5
ServerLimit          16
MaxClients           16
MaxRequestsPerChild   0
ListenBacklog        100

[mysqld]
user            = mysql
port            = 3306
socket          = /var/run/mysqld/mysqld.sock
skip-locking
key_buffer_size = 1M
max_allowed_packet = 1M
table_open_cache = 10
sort_buffer_size = 64K
net_buffer_length = 2K
skip-innodb
# Don't listen on a TCP/IP port at all. This can be a security enhancement,
# if all processes that need to connect to mysqld run on the same host.
# All interaction with mysqld must be made via Unix sockets or named pipes.
# Note that using this option without enabling named pipes on Windows
# (using the "enable-named-pipe" option) will render mysqld useless!
#
#skip-networking
server-id       = 1
# Uncomment the following if you want to log updates
#log-bin=mysql-bin
# binary logging format - mixed recommended
#binlog_format=mixed
# Uncomment the following if you are using InnoDB tables
#innodb_data_home_dir = /var/lib/mysql/
#innodb_data_file_path = ibdata1:10M:autoextend
#innodb_log_group_home_dir = /var/lib/mysql/
# You can set .._buffer_pool_size up to 50 - 80 %
# of RAM but beware of setting memory usage too high
#innodb_buffer_pool_size = 16M
# Set .._log_file_size to 25 % of buffer pool size
#innodb_log_file_size = 5M
#innodb_log_buffer_size = 8M
#innodb_flush_log_at_trx_commit = 1
#innodb_lock_wait_timeout = 50

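2. Installing WordPress

(1) Back up the blog folder and the database of the old system;

(2) Copy both and install them under the new domain;

(3) Activate the system under the new domain (just visit it; at login you are redirected back to the old system);

(4) In the old system, go to Settings and change the host (domain) to the new domain name;

(5) Copy the old system's blog folder and database to the new domain again;

(6) Log in under the new domain; this time it should work!

3. Installing the VPN

(2) Akismet: blocks annoying spammers

(3) NextGEN Gallery: provides a way to show my own pictures

(5) MathJAX: provides LaTeX syntax support for WordPress, making it easy to enter and display math formulas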

EDIT:

http://www.matrix67.com/blog/archives/2660

## An uninitialized variable causes a mysterious bug

Reason:

I used the strtod() function in C but forgot to set errno to 0 beforehand, so after calling strtod() the value of errno is unpredictable: it may still hold an error left over from an earlier call.

How I found the bug:

I was trying out the Python ctypes module. In script mode, python pyCtypes.py always crashed, but in interactive (command line) mode the same code worked fine. It was mysterious that running Python in different ways gave different results.

At first, I thought Python clearly had a bug, since the same Python code should give the same result.

Then I realized that my Python code calls a DLL routine written in C, and I recalled that the man page of “strtod” says errno should be set to zero before each call.

Example code:
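The first statement, errno = 0;, must be added to ensure correctness.

/* requires <errno.h>, <stdio.h> and <stdlib.h> */
errno = 0;  /* clear any stale error before calling strtod() */
vec->value[vec->len++] = strtod(temp, NULL);
//vector_print(vec);
if (errno != 0) {
    perror("strtod");
    fprintf(stderr, "%s\n", temp);
    exit(EXIT_FAILURE);
}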


Note:

WordPress supports syntax highlighting.

I am using SyntaxHighlighter Evolved http://wordpress.org/extend/plugins/syntaxhighlighter/.

Official documentation mentions another way: http://en.support.wordpress.com/code/posting-source-code/.

## NextGEN Gallery

I tried to show 10 pictures using a NextGEN Gallery gallery, as follows:

Another useful tag is “imagebrowser”; it shows bigger pictures with their titles and the total picture count.

Note: there are no spaces after [ and before ].

Detailed documentation: http://wordpress.org/extend/plugins/nextgen-gallery/faq/