Table of Contents
Hidden Markov Model (隐马尔科夫模型)定义基本问题前向算法 算法流程实现代码 后向算法 算法流程实现代码 Viterbi算法 算法流程实现代码 Baum-Welch 算法 单观测序列多观测序列实现代码(单观测序列) 符号总结hmmlearn 安装使用 常见错误 GitHub参考资料Hidden Markov Model (隐马尔科夫模型)
Back to TOC
两种问题特征:
基于序列的,比如时间序列,或者状态序列两类数据,一类序列数据是可以观测到的,即观测序列;而另一类数据是不能观察到的,即隐藏状态序列,简称状态序列
定义
Back to TOC
假设 N N N是可能的隐藏状态数, M M M是可能的观测状态数,定义
Q = { q 1 , q 2 , … , q N } , V = { v 1 , v 2 , … , v M } \mathcal{Q}=\{q_1,q_2,\dots,q_N\},\mathcal{V}=\{v_1,v_2,\dots,v_M\} Q={q1,q2,…,qN},V={v1,v2,…,vM}
分别为所有可能的隐藏状态和所有可能的观测状态的集合
同时,对于一个长度为 T T T的序列 I I I,和对应的观测序列 O O O
I = { s 1 , s 2 , … , s T } , O = { o 1 , o 2 , … , o T } \mathcal{I}=\{s_1,s_2,\dots,s_T\},\mathcal{O}=\{o_1,o_2,\dots,o_T\} I={s1,s2,…,sT},O={o1,o2,…,oT}
HMM做了两个很重要的假设:
齐次马尔科夫链假设。任意时刻隐藏状态只依赖于它前一个隐藏状态
定义状态转移概率 A i j A_{ij} Aij为从当前时刻 t t t的状态 s i s_i si转移到下一时刻 t + 1 t+1 t+1的状态 s j s_j sj的概率,即
A i j = P ( s t + 1 = q j ∣ s t = q i ) A_{ij}=P(s_{t+1}=q_j|s_t=q_i) Aij=P(st+1=qj∣st=qi)
从而定义状态转移矩阵 A ∈ R N × N A\in \mathbb{R}^{N\times N} A∈RN×N观测独立性假设即任意时刻的观测状态只仅仅依赖于当前时刻的隐藏状态。定义生成概率 B i j B_{ij} Bij为由隐藏状态 s i s_i si生成观测状态 q j q_j qj的概率,即
B i j = P ( o t = v i ∣ s t = q j ) B_{ij}=P(o_t=v_i|s_t=q_j) Bij=P(ot=vi∣st=qj)
从而定义生成概率矩阵(发射矩阵) B ∈ R N × M B\in \mathbb{R}^{N\times M} B∈RN×M
最后,定义在 t t t时刻的隐藏状态分布 Π t = [ π t ( k ) ] \Pi_t=[\pi_t (k)] Πt=[πt(k)],其中 π t ( k ) = P ( s t = q k ) \pi_t (k)=P(s_t=q_k) πt(k)=P(st=qk)
因此一个HMM模型主要由三个参数表示:
λ = ( A , B , Π ) \lambda=(A,B,\Pi) λ=(A,B,Π)
基本问题
Back to TOC
评估观察序列概率。给定模型 λ \lambda λ和观测序列 O \mathcal{O} O,计算在模型 λ \lambda λ下该观测序列 O \mathcal{O} O出现的概率 P ( O ∣ λ ) P(\mathcal{O}|\lambda) P(O∣λ)。求解方法:前向后向算法 预测问题。给定观测序列 O = { o 1 , o 2 , … , o T } \mathcal{O}=\{o_1,o_2,\dots,o_T\} O={o1,o2,…,oT}和模型参数 λ = ( A , B , Π ) \lambda=(A,B,\Pi) λ=(A,B,Π),求解最有可能出现的隐藏状态序列。求解方法:Viterbi算法 模型参数学习问题。给定观测序列 O = { o 1 , o 2 , … , o T } \mathcal{O}=\{o_1,o_2,\dots,o_T\} O={o1,o2,…,oT},求解模型参数 λ = ( A , B , Π ) \lambda=(A,B,\Pi) λ=(A,B,Π)使得 P ( O ∣ λ ) P(\mathcal{O}|\lambda) P(O∣λ)最大。求解方法:Baum-Walch算法(EM算法)
前向算法
Back to TOC
算法流程
输入:观测序列 O = { o 1 , o 2 , … , o T } \mathcal{O}=\{o_1,o_2,\dots,o_T\} O={o1,o2,…,oT},模型参数 λ = ( A , B , Π ) \lambda=(A,B,\Pi) λ=(A,B,Π)
输出:观测序列 P ( O ∣ λ ) P(O|\lambda) P(O∣λ)
步骤:
计算时刻1各个隐藏状态 s i s_i si的前向概率
α 1 ( i ) = π ( i ) B i , o 1 , i = 1 , 2 , … , N \alpha_1(i)=\pi(i)B_{i,o_1},i=1,2,\dots,N α1(i)=π(i)Bi,o1,i=1,2,…,N递推 2 , 3 , … , T 2,3,\dots,T 2,3,…,T时刻的前向概率
α t + 1 ( i ) = [ ∑ j = 1 N α t ( j ) A j i ] B i , o t + 1 , i = 1 , 2 , … , N \alpha_{t+1}(i)=[\sum_{j=1}^{N}\alpha_{t}(j)A_{ji}]B_{i,o_{t+1}},i=1,2,\dots,N αt+1(i)=[j=1∑Nαt(j)Aji]Bi,ot+1,i=1,2,…,N最终结果
P ( O ∣ λ ) = ∑ i N α T ( i ) P(\mathcal{O}|\lambda)=\sum_i^N\alpha_T(i) P(O∣λ)=i∑NαT(i)
实现代码
def HMMfwd(pi, a, b, obs):'''pi:初始概率分布a:状态转移矩阵b:发射矩阵obs:观测序列'''nStates = np.shape(b)[0]T = np.shape(obs)[0]alpha = np.zeros((nStates,T))'''alpha[i,t]表示上述公式的 alpha_t(i)'''alpha[:,0] = pi*b[:,obs[0]]for t in range(1,T):for s in range(nStates):alpha[s,t] = b[s,obs[t]] * np.sum(alpha[:,t-1] * a[:,s])return alpha
最后计算 P ( O ∣ λ ) P(\mathcal{O}|\lambda) P(O∣λ)
np.sum(alpha[:,-1])
后向算法
Back to TOC
算法流程
输入:观测序列 O = { o 1 , o 2 , … , o T } \mathcal{O}=\{o_1,o_2,\dots,o_T\} O={o1,o2,…,oT},模型参数 λ = ( A , B , Π ) \lambda=(A,B,\Pi) λ=(A,B,Π)
输出:观测序列 P ( O ∣ λ ) P(O|\lambda) P(O∣λ)
步骤:
初始化时刻 T T T各个隐藏状态 s i s_i si的前向概率
β T ( i ) = 1 , i = 1 , 2 , … , N \beta_T(i)=1, i=1,2,\dots,N βT(i)=1,i=1,2,…,N递推 T − 1 , T − 2 , … , 1 T-1,T-2,\dots,1 T−1,T−2,…,1时刻的后向概率
β t ( i ) = ∑ i = 1 N A i j B j , o t + 1 β t + 1 ( j ) , i = 1 , 2 , … , N \beta_{t}(i)=\sum_{i=1}^{N}A_{ij}B_{j,o_{t+1}}\beta_{t+1}(j),i=1,2,\dots,N βt(i)=i=1∑NAijBj,ot+1βt+1(j),i=1,2,…,N最终结果
P ( O ∣ λ ) = ∑ i N π i B i , o 1 β 1 ( i ) P(\mathcal{O}|\lambda)=\sum_i^N\pi_iB_{i,o_1}\beta_1(i) P(O∣λ)=i∑NπiBi,o1β1(i)
实现代码
def HMMbwd(a, b, obs):'''a:状态转移矩阵b:发射矩阵obs:观测序列'''nStates = np.shape(b)[0]T = np.shape(obs)[0]'''beta[i,t]表示上述公式的 beta_t(i)'''beta = np.zeros((nStates,T))beta[:,-1] = 1.0 for t in range(T-2,-1,-1):for s in range(nStates):beta[s,t] = np.sum(a[s,:] * b[:,obs[t+1]] * beta[:,t+1] )return beta
最后计算 P ( O ∣ λ ) P(\mathcal{O}|\lambda) P(O∣λ)
np.sum(pi*b[:,obs[0]]*beta[:,0])
Viterbi算法
Back to TOC
算法流程
输入:观测序列 O = { o 1 , o 2 , … , o T } \mathcal{O}=\{o_1,o_2,\dots,o_T\} O={o1,o2,…,oT},模型参数 λ = ( A , B , Π ) \lambda=(A,B,\Pi) λ=(A,B,Π)
输出:最有可能的有隐状态序列 I ∗ = { i 1 ∗ , i 2 ∗ , … , i T ∗ } \mathcal{I}^*=\{i^{*}_1,i^{*}_2,\dots,i^{*}_T\} I∗={i1∗,i2∗,…,iT∗}
步骤:
初始化局部状态
δ 1 ( i ) = π i B i , o 1 , Φ 1 ( i ) = 0 , i = 1 , 2 , … , N \delta_1(i)=\pi_iB_{i,o_1},\quad \Phi_1(i)=0,\quad i=1,2,\dots,N δ1(i)=πiBi,o1,Φ1(i)=0,i=1,2,…,N进行动态规划递推时刻 t = 2 , 3 , … , T t=2,3,\dots,T t=2,3,…,T的局部状态
δ t ( i ) = max 1 ≤ j ≤ N [ δ t − 1 ( j ) A j , i ] B i , o t , i = 1 , 2 , … , N Φ t ( i ) = arg max 1 ≤ j ≤ N [ δ t − 1 ( j ) A j , i ] , i = 1 , 2 , … , N \begin{array}{c} {\delta_t(i)=\max_{1\leq j\leq N}[\delta_{t-1}(j)A_{j,i}]B_{i,o_t},\quad i=1,2,\dots,N} \\ \\ {\Phi_t(i)=\arg \max_{1\leq j\leq N}[\delta_{t-1}(j)A_{j,i}],\quad i=1,2,\dots,N} \end{array} δt(i)=max1≤j≤N[δt−1(j)Aj,i]Bi,ot,i=1,2,…,NΦt(i)=argmax1≤j≤N[δt−1(j)Aj,i],i=1,2,…,N计算时刻 T T T最大的 δ T ( i ) \delta_T(i) δT(i)即为最可能隐藏状态序列出现的概率,计算时刻 T T T最大的 Φ T ( i ) \Phi_T(i) ΦT(i),即为最可能的隐藏状态
i T ∗ = arg max 1 ≤ j ≤ N δ T ( i ) i^*_T=\arg \max_{1\leq j\leq N}\delta_{T}(i) iT∗=arg1≤j≤NmaxδT(i)利用局部状态 Φ ( i ) \Phi(i) Φ(i)开始回溯,对于时刻 t = T − 1 , T − 2 , … , 1 t=T-1,T-2,\dots,1 t=T−1,T−2,…,1:
i t ∗ = Φ t + 1 ( i t + 1 ∗ ) i^*_t=\Phi_{t+1}(i^*_{t+1}) it∗=Φt+1(it+1∗)最终得到最有可能的隐藏状态序列 I ∗ = { i 1 ∗ , i 2 ∗ , … , i T ∗ } \mathcal{I}^*=\{i^{*}_1,i^{*}_2,\dots,i^{*}_T\} I∗={i1∗,i2∗,…,iT∗}
实现代码
def Viterbi(pi, a, b, obs):'''pi:初始概率分布a:状态转移矩阵b:发射矩阵obs:观测序列'''nStates = np.shape(b)[0]T = np.shape(obs)[0]path = np.zeros(T, dtype=np.int32)delta = np.zeros((nStates,T))phi = np.zeros((nStates,T))delta[:,0] = pi * b[:,obs[0]]phi[:,0] = 0for t in range(1,T):for s in range(nStates):delta[s,t] = np.max(delta[:,t-1]*a[:,s])*b[s,obs[t]]phi[s,t] = np.argmax(delta[:,t-1]*a[:,s])path[-1] = np.argmax(delta[:,-1])for t in range(T-2,-1,-1):path[t] = phi[path[t+1],t+1]return path
最后解码得到的序列即为path
Baum-Welch 算法
Back to TOC
单观测序列
输入: 1 1 1个观测序列样本 O = { o 1 , o 2 , … , o T } \mathcal{O}=\{o_1,o_2,\dots,o_T\} O={o1,o2,…,oT}
输出:模型参数 λ = ( A , B , Π ) \lambda=(A,B,\Pi) λ=(A,B,Π)
步骤:
随机初始化或手动初始化 A , B , Π A,B,\Pi A,B,ΠE 步,求解两个中间变量 γ t ( i ) , ξ t ( i , j ) \gamma_{t}(i), \xi_{t}(i, j) γt(i),ξt(i,j),两者含义如下 γ t ( i ) \gamma_{t}(i) γt(i):给定模型 λ \lambda λ和观测序列 O \mathcal{O} O,在时刻 t t t的隐含状态为 q i q_i qi的概率,即 γ t ( i ) = P ( s t = q i ∣ O , λ ) \gamma_{t}(i)=P(s_{t}=q_{i} | O, \lambda) γt(i)=P(st=qi∣O,λ) ξ t ( i , j ) \xi_{t}(i, j) ξt(i,j):给定模型 λ \lambda λ和观测序列 O \mathcal{O} O,在时刻 t t t的隐含状态为 q i q_i qi,时刻 t + 1 t+1 t+1的隐含状态为 q j q_j qj的概率,即 ξ t ( i , j ) = P ( s t = q i , s t + 1 = q j ∣ O , λ ) \xi_{t}(i, j)=P(s_{t}=q_{i}, s_{t+1}=q_{j} | O, \lambda) ξt(i,j)=P(st=qi,st+1=qj∣O,λ)
结合前面的前向概率和后向概率的定义,计算这两个中间变量的公式如下:
γ t ( i ) = α t ( i ) β t ( i ) ∑ j = 1 N α t ( j ) β t ( j ) ξ t ( i , j ) = α t ( i ) A i j B j , o t + 1 β t + 1 ( j ) ∑ p = 1 N ∑ q = 1 N α t ( p ) A p q B q , o t + 1 β t + 1 ( q ) \begin{array}{c}{\gamma_{t}(i)=\frac{\alpha_{t}(i) \beta_{t}(i)}{\sum_{j=1}^{N} \alpha_{t}(j) \beta_{t}(j)}} \\ \\ {\xi_{t}(i, j)=\frac{\alpha_{t}(i) A_{i j} B_{j,o_{t+1}} \beta_{t+1}(j)}{\sum_{p=1}^{N} \sum_{q=1}^{N} \alpha_{t}(p) A_{pq} B_{q,o_{t+1}} \beta_{t+1}(q)}}\end{array} γt(i)=∑j=1Nαt(j)βt(j)αt(i)βt(i)ξt(i,j)=∑p=1N∑q=1Nαt(p)ApqBq,ot+1βt+1(q)αt(i)AijBj,ot+1βt+1(j) M 步,通过 E 步求解出的两个中间变量来求解模型参数,求解公式如下:
A i j = ∑ t = 1 T − 1 ξ t ( i , j ) ∑ t = 1 T − 1 γ t ( i ) B i j = ∑ t = 1 T γ t ( i ) I ( o t = v j ) ∑ t = 1 T γ t ( i ) π i = γ 1 ( i ) \begin{array}{c} {A_{i j}=\frac{\sum_{t=1}^{T-1} \xi_{t}(i, j)}{\sum_{t=1}^{T-1} \gamma_{t}(i)}} \\ \\ {B_{ij}=\frac{\sum_{t=1}^{T} \gamma_{t}(i) I\left(o_{t}=v_{j}\right)}{\sum_{t=1}^{T} \gamma_{t}(i)}} \\ \\ {\pi_{i}=\gamma_{1}(i)} \end{array} Aij=∑t=1T−1γt(i)∑t=1T−1ξt(i,j)Bij=∑t=1Tγt(i)∑t=1Tγt(i)I(ot=vj)πi=γ1(i)
上式中的 I ( o t = v j ) I(o_{t}=v_{j}) I(ot=vj)表示当时刻 t t t观测状态为 v k v_k vk 时, I ( o t = v j ) = 1 I(o_{t}=v_{j})=1 I(ot=vj)=1,否则 I ( o t = v j ) = 0 I(o_{t} = v_{j})=0 I(ot=vj)=0
多观测序列
输入: D D D个观测序列样本 { O 1 , O 2 , … , O D } \{\mathcal{O}_1,\mathcal{O}_2,\dots,\mathcal{O}_D\} {O1,O2,…,OD},其中 O d = { o 1 , o 2 , … , o T } , d = 1 , 2 , … , D \mathcal{O}_d=\{o_1,o_2,\dots,o_T\},d=1,2,\dots,D Od={o1,o2,…,oT},d=1,2,…,D
输出:模型参数 λ = ( A , B , Π ) \lambda =(A,B,\Pi) λ=(A,B,Π)
步骤:
随机初始化或手动初始化 A , B , Π A,B,\Pi A,B,ΠE 步,为每个序列求解两个中间变量 γ t ( d ) ( i ) , ξ t ( d ) ( i , j ) \gamma^{(d)}_{t}(i), \xi^{(d)}_{t}(i, j) γt(d)(i),ξt(d)(i,j)
类似单观测序列,只不过每个序列各自用前向后向算法求出各自的 α ( d ) \alpha^{(d)} α(d)和 β ( d ) \beta^{(d)} β(d),求解公式如下:
γ t ( d ) ( i ) = α t ( d ) ( i ) β t ( d ) ( i ) ∑ j = 1 N α t ( d ) ( j ) β t ( d ) ( j ) ξ t ( d ) ( i , j ) = α t ( d ) ( i ) A i j B j , o t + 1 β t + 1 ( d ) ( j ) ∑ p = 1 N ∑ q = 1 N α t ( d ) ( p ) A p q B q , o t + 1 β t + 1 ( d ) ( q ) \begin{array}{c}{\gamma^{(d)}_{t}(i)=\frac{\alpha^{(d)}_{t}(i) \beta^{(d)}_{t}(i)}{\sum_{j=1}^{N} \alpha^{(d)}_{t}(j) \beta^{(d)}_{t}(j)}} \\ \\ {\xi^{(d)}_{t}(i, j)=\frac{\alpha^{(d)}_{t}(i) A_{i j} B_{j,o_{t+1}} \beta^{(d)}_{t+1}(j)}{\sum_{p=1}^{N} \sum_{q=1}^{N} \alpha^{(d)}_{t}(p) A_{pq} B_{q,o_{t+1}} \beta^{(d)}_{t+1}(q)}}\end{array} γt(d)(i)=∑j=1Nαt(d)(j)βt(d)(j)αt(d)(i)βt(d)(i)ξt(d)(i,j)=∑p=1N∑q=1Nαt(d)(p)ApqBq,ot+1βt+1(d)(q)αt(d)(i)AijBj,ot+1βt+1(d)(j)M 步,通过 E 步求解出的两个中间变量来求解模型参数,同样类似单观测序列,只不过更新参数时需要考虑所有序列(即外层加上一个求和复方蒿),求解公式如下:
A i j = ∑ d = 1 D ∑ t = 1 T − 1 ξ t ( d ) ( i , j ) ∑ d = 1 D ∑ t = 1 T − 1 γ t ( d ) ( i ) B i j = ∑ d = 1 D ∑ t = 1 T γ t ( d ) ( i ) I ( o t = v j ) ∑ d = 1 D ∑ t = 1 T γ t ( d ) ( i ) π i = ∑ d = 1 D γ 1 ( d ) ( i ) D \begin{array}{c} {A_{i j}=\frac{\sum_{d=1}^D \sum_{t=1}^{T-1} \xi^{(d)}_{t}(i, j)}{\sum_{d=1}^D \sum_{t=1}^{T-1} \gamma^{(d)}_{t}(i)}} \\ \\ {B_{ij}=\frac{\sum_{d=1}^D \sum_{t=1}^{T} \gamma^{(d)}_{t}(i) I\left(o_{t}=v_{j}\right)}{\sum_{d=1}^D \sum_{t=1}^{T} \gamma^{(d)}_{t}(i)}} \\ \\ {\pi_{i}=\frac{\sum_{d=1}^D \gamma^{(d)}_{1}(i)}{D}} \end{array} Aij=∑d=1D∑t=1T−1γt(d)(i)∑d=1D∑t=1T−1ξt(d)(i,j)Bij=∑d=1D∑t=1Tγt(d)(i)∑d=1D∑t=1Tγt(d)(i)I(ot=vj)πi=D∑d=1Dγ1(d)(i)
同样的,上式中的 I ( o t = v j ) I(o_{t}=v_{j}) I(ot=vj)表示当时刻 t t t观测状态为 v k v_k vk 时, I ( o t = v j ) = 1 I(o_{t}=v_{j})=1 I(ot=vj)=1,否则 I ( o t = v j ) = 0 I(o_{t} = v_{j})=0 I(ot=vj)=0
实现代码(单观测序列)
def BaumWelch(obs, n_states, n_obs, pi=None, a=None, b=None, tol=1e-2, n_iter=10):T = np.shape(obs)[0]# initializeif not pi:pi = np.ones((n_states))/n_statesif not a:a = np.random.rand(n_states, n_states)if not b:b = np.random.rand(n_states, n_obs)xi = np.zeros((n_states, n_states, T))nits = 0while True:nits += 1old_a = a.copy()old_b = b.copy()alpha = HMMfwd(pi, a, b, obs)beta = HMMbwd(a, b, obs)gamma = alpha*betagamma /= gamma.sum(0)# E-stepfor t in range(T-1):for i in range(n_states):for j in range(n_states):xi[i, j, t] = alpha[i,t]*beta[j,t+1]*a[i,j]*b[j,obs[t+1]]xi[:,:,t] /= xi[:,:,t].sum()# The last step has no b, beta infor i in range(n_states):for j in range(n_states):xi[i,j,-1] = alpha[i,-1]*a[i,j]xi[:,:,-1] /= xi[:,:,-1].sum()# M-stepfor i in range(n_states):for j in range(n_states):a[i,j] = xi[i,j,:-1].sum()/gamma[i,:-1].sum()a /= a.sum(1, keepdims=True)for i in range(n_states):for j in range(n_obs):found = (obs==j).nonzero()[0]b[i,j] = gamma[i,found].sum()/gamma[i].sum()b /= b.sum(1, keepdims=True)pi = gamma[:, 0]if np.linalg.norm(a - old_a) < tol and np.linalg.norm(b - old_b)< tol or nits > n_iter:breakreturn pi, a, b
符号总结
Back to TOC
hmmlearn
Back to TOC
hmmlearn 是一个用于隐马尔可夫模型无监督学习和推理的算法库,其提供了易于使用的类似于sklearn的API。
安装
pip install hmmlearn
注意:前提是要有gcc编译器,因为部分源代码由C编译而成。
当前版本:V 0.2.2
使用
这里只介绍通用的离散HMM模型的使用(序列是离散的)
建立一个HMM模型
from hmmlearn import hmmmodel = hmm.MultinomialHMM(n_components=n_states, tol=1e-2, n_iter=10, init_params='')model.startprob_ = init_pimodel.transmat_ = init_Tmodel.emissionprob_ = init_E
如果是自己初始化的话,init_params=''
,否则模型随机初始化(默认)init_params='ste'
代表初始化三个参数,并且需要手动设定model.n_features = n_obs
hmm.MultinomialHMM
相关参数
n_components
: int
Number of states. (隐状态数)
startprob_prior
: array, shape (n_components, ), optional
Parameters of the Dirichlet prior distribution for
:attr:startprob_
.
transmat_prior
: array, shape (n_components, n_components), optional
Parameters of the Dirichlet prior distribution for each row
of the transition probabilities :attr:transmat_
.
algorithm
: string, optional
Decoder algorithm. Must be one of “viterbi” or “map”.
Defaults to “viterbi”.
random_state
: RandomState or an int seed, optional
A random number generator instance.
n_iter
: int, optional
Maximum number of iterations to perform.
tol
: float, optional
Convergence threshold. EM will stop if the gain in log-likelihood
is below this value.
verbose
: bool, optional
WhenTrue
per-iteration convergence reports are printed
to :data:sys.stderr
. You can diagnose convergence via the
:attr:monitor_
attribute.
params
: string, optional
Controls which parameters are updated in the training
process. Can contain any combination of ‘s’ for startprob,
‘t’ for transmat, ‘e’ for emissionprob.
Defaults to all parameters.
init_params
: string, optional
Controls which parameters are initialized prior to
training. Can contain any combination of ‘s’ for
startprob, ‘t’ for transmat, ‘e’ for emissionprob.
Defaults to all parameters.
计算 P ( O ∣ λ ) P(\mathcal{O}|\lambda) P(O∣λ)(问题1)
model.score(obs.T)
Viterbi 算法解码(问题2)
model.decode(obs.T)
Baum-Welch 算法训练(问题3)
model.fit(obs.T)
常见错误
Expected 2D array, got 1D array instead输入的obs
序列需要是一个[n_sample,n_obs]的二维数组,如果只有一个观测序列,需要将其扩展为二维,使用np.atleast_2d(obs)
Buffer dtype mismatch, expected ‘dtype_t’ but got 'float’
由于hmmlearn库在前向传播的时候调用的是C编译的代码,因此如果传入的变量类型不为float64
则会报如上错误。解决办法:将传入的初始化变量类型改为np.float64
。arrays used as indices must be of integer (or boolean) type
传入的obs
序列需要为整型expected a sample from a Multinomial distribution.
源码的解释是:X
should be an array of non-negative integers from range[min(X), max(X)]
, such that each integer from the range occurs inX
at least once. For example[0, 0, 2, 1, 3, 1, 1]
is a valid sample from a Multinomial distribution, while[0, 0, 3, 5, 10]
is not.
也就是说传入的序列需要每个obs都至少出现一次,但这在大多数情况下不太可能,官方GitHub的解决办法是:
>>> from sklearn.preprocessing import LabelEncoder>>> LabelEncoder().fit_transform([0, 1, 5, 10])array([0, 1, 2, 3])
但是这样重新编码不就是失去了一些状态?因此我的解决办法是,注释掉hmm.py
第407
行左右的self._check_input_symbols(X)
语句,不让它对输入进行检查,然后对于第436
行可能出错的代码进行一点小改进
def _compute_log_likelihood(self, X):return np.log(self.emissionprob_)[:, np.concatenate(X)].T
增加一个微小常数,改为
def _compute_log_likelihood(self, X):return np.log(self.emissionprob_ + 1e-10)[:, np.concatenate(X)].T
GitHub
Back to TOC
源代码GitHub传送门
参考资料
Back to TOC
刘建平:隐马尔科夫模型HMMMarslandMLAlgohmmlearn吴良超:隐马尔可夫模型的三大问题及求解方法 Baum-Welch algorithm: Finding parameters for our HMM一站式解决:隐马尔可夫模型(HMM)全过程推导及实现