Regularization (mathematics)


Regularization, in mathematics and statistics and particularly in the fields of machine learning and inverse problems, is a process of introducing additional information in order to solve an ill-posed problem or to prevent overfitting.

Introduction

In general, a regularization term $R(f)$ is introduced to a general loss function:

$$\min_{f}\sum_{i=1}^{n}V(f(\hat{x}_{i}),\hat{y}_{i})+\lambda R(f)$$

for a loss function $V$ that describes the cost of predicting $f(x)$ when the label is $y$, such as the square loss or hinge loss, and for a term $\lambda$ which controls the importance of the regularization term. $R(f)$ is typically a penalty on the complexity of $f$, such as restrictions for smoothness or bounds on the vector space norm.[1]
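As a minimal sketch of this objective, assuming a linear model $f(x)=w\cdot x$ and the square loss (the function and variable names below are illustrative, not from the article), the data-fit term and the penalty $\lambda R(w)$ can be kept separate so that swapping the penalty changes the regularizer without touching the loss:

```python
import numpy as np

def regularized_loss(w, X, y, penalty, lam=0.1):
    """Empirical square loss plus a regularization term lambda * R(w)."""
    residuals = X @ w - y
    data_fit = np.sum(residuals ** 2)        # sum_i V(f(x_i), y_i) with the square loss
    return data_fit + lam * penalty(w)       # + lambda * R(w)

# Two common choices of R(w):
ridge_penalty = lambda w: np.sum(w ** 2)     # squared L2 norm (Tikhonov / ridge)
lasso_penalty = lambda w: np.sum(np.abs(w))  # L1 norm (sparsity-inducing)
```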

A theoretical justification for regularization is that it attempts to impose Occam's razor on the solution. From a Bayesian point of view, many regularization techniques correspond to imposing certain prior distributions on model parameters.

Regularization can be used to learn simpler models, induce models to be sparse, introduce group structure into the learning problem, and more.

The same idea arose in many fields of science. For example, the least-squares method can be viewed as a very simple form of regularization[citation needed]. A simple form of regularization applied to integral equations, generally termed Tikhonov regularization after Andrey Nikolayevich Tikhonov, is essentially a trade-off between fitting the data and reducing a norm of the solution. More recently, non-linear regularization methods, including total variation regularization, have become popular.

Generalization

Main article:  Generalization error

Regularization can be motivated as a technique to improve the generalization of a learned model.

The goal of this learning problem is to find a function that fits or predicts the outcome (label) and minimizes the expected error over all possible inputs and labels. The expected error of a function $f_{n}$ is:

$$I[f_{n}]=\int_{X\times Y}V(f_{n}(x),y)\,\rho(x,y)\,dx\,dy$$

Typically in learning problems, only a subset of input data and labels is available, measured with some noise. Therefore, the expected error is unmeasurable, and the best surrogate available is the empirical error over the $N$ available samples:

$$I_{S}[f_{n}]=\frac{1}{N}\sum_{i=1}^{N}V(f_{n}(\hat{x}_{i}),\hat{y}_{i})$$

Without bounds on the complexity of the function space (formally, the reproducing kernel Hilbert space) available, a model will be learned that incurs zero loss on the surrogate empirical error. If measurements (e.g. of $x_{i}$) were made with noise, this model may suffer from overfitting and display poor expected error. Regularization introduces a penalty for exploring certain regions of the function space used to build the model, which can improve generalization.

Tikhonov regularization

Main article:  Tikhonov regularization

When learning a linear function, such that $f(x)=w\cdot x$, penalizing with the squared $L_{2}$ norm of $w$ corresponds to Tikhonov regularization. This is one of the most common forms of regularization; it is also known as ridge regression and is expressed as:

$$\min_{w}\sum_{i=1}^{n}V(\hat{x}_{i}\cdot w,\hat{y}_{i})+\lambda\|w\|_{2}^{2}$$

In the case of a general function, we take the norm of the function in its reproducing kernel Hilbert space:

$$\min_{f}\sum_{i=1}^{n}V(f(\hat{x}_{i}),\hat{y}_{i})+\lambda\|f\|_{\mathcal{H}}^{2}$$

As the $L_{2}$ norm is differentiable, learning problems using Tikhonov regularization can be solved by gradient descent.

Tikhonov regularized least squares

The learning problem with the least squares loss function and Tikhonov regularization can be solved analytically. Written in matrix form, the optimal $w$ is the one for which the gradient of the loss function with respect to $w$ is 0.

$$\min_{w}\frac{1}{n}(\hat{X}w-Y)^{T}(\hat{X}w-Y)+\lambda\|w\|_{2}^{2}$$

$$\nabla_{w}=\frac{2}{n}\hat{X}^{T}(\hat{X}w-Y)+2\lambda w$$

Setting this gradient to zero is the first-order condition for this optimization problem:

$$0=\hat{X}^{T}(\hat{X}w-Y)+n\lambda w$$

$$w=(\hat{X}^{T}\hat{X}+\lambda nI)^{-1}\hat{X}^{T}Y$$

By construction of the optimization problem, other values of $w$ would give larger values for the loss function. This can be verified by examining the second derivative $\nabla_{ww}$.

During training, this algorithm takes $O(d^{3}+nd^{2})$ time. The terms correspond to the matrix inversion and calculating $\hat{X}^{T}\hat{X}$, respectively. Testing takes $O(nd)$ time.
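A minimal sketch of this closed-form solution (the function name is illustrative; solving the linear system is preferred to forming the inverse explicitly, with the same $O(d^{3})$ cost):

```python
import numpy as np

def ridge_closed_form(X, Y, lam):
    """Tikhonov-regularized least squares: w = (X^T X + lambda*n*I)^{-1} X^T Y."""
    n, d = X.shape
    A = X.T @ X + lam * n * np.eye(d)   # forming X^T X costs O(n d^2)
    return np.linalg.solve(A, X.T @ Y)  # solving the d x d system costs O(d^3)

# Example on synthetic data:
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
w_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
Y = X @ w_true + 0.1 * rng.normal(size=100)
w_hat = ridge_closed_form(X, Y, lam=0.1)
```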

Early stopping

Main article:  Early stopping

Early stopping can be viewed as regularization in time. Intuitively, a training procedure like gradient descent tends to learn more and more complex functions as the number of iterations increases. By regularizing in time, the complexity of the model can be controlled, improving generalization.

In practice, early stopping is implemented by training on a training set and measuring accuracy on a statistically independent validation set. The model is trained until performance on the validation set no longer improves. The model is then tested on a testing set.
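A minimal sketch of this procedure, assuming gradient descent on the least-squares loss (names and hyperparameters are illustrative):

```python
import numpy as np

def train_with_early_stopping(X_tr, y_tr, X_val, y_val,
                              lr=0.01, max_iters=10_000, patience=20):
    """Gradient descent on the least-squares loss, stopped once the
    validation error has not improved for `patience` iterations."""
    n, d = X_tr.shape
    w = np.zeros(d)
    best_w, best_val, since_best = w.copy(), np.inf, 0
    for _ in range(max_iters):
        grad = (2.0 / n) * X_tr.T @ (X_tr @ w - y_tr)
        w -= lr * grad
        val_err = np.mean((X_val @ w - y_val) ** 2)
        if val_err < best_val:
            best_w, best_val, since_best = w.copy(), val_err, 0
        else:
            since_best += 1
            if since_best >= patience:
                break   # validation error is no longer improving
    return best_w
```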

Theoretical motivation in least squares

Consider the finite approximation of the Neumann series for an invertible matrix $A$ where $\|I-A\|<1$:

$$\sum_{i=0}^{T-1}(I-A)^{i}\approx A^{-1}$$

This can be used to approximate the analytical solution of unregularized least squares, if $\gamma$ is introduced to ensure the norm is less than one.

$$w_{T}=\frac{\gamma}{n}\sum_{i=0}^{T-1}\left(I-\frac{\gamma}{n}\hat{X}^{T}\hat{X}\right)^{i}\hat{X}^{T}\hat{Y}$$

The exact solution to the unregularized least squares learning problem will minimize the empirical error, but may fail to generalize and minimize the expected error. By limiting $T$, the only free parameter in the algorithm above, the problem is regularized in time, which may improve its generalization.

The algorithm above is equivalent to restricting the number of gradient descent iterations for the empirical risk

$$I_{s}[w]=\frac{1}{2n}\|\hat{X}w-\hat{Y}\|_{\mathbb{R}^{n}}^{2}$$

with the gradient descent update:

$$w_{0}=0$$

$$w_{t+1}=\left(I-\frac{\gamma}{n}\hat{X}^{T}\hat{X}\right)w_{t}+\frac{\gamma}{n}\hat{X}^{T}\hat{Y}$$

The base case is trivial. The inductive case is proved as follows:

$$w_{T}=\left(I-\frac{\gamma}{n}\hat{X}^{T}\hat{X}\right)\frac{\gamma}{n}\sum_{i=0}^{T-2}\left(I-\frac{\gamma}{n}\hat{X}^{T}\hat{X}\right)^{i}\hat{X}^{T}\hat{Y}+\frac{\gamma}{n}\hat{X}^{T}\hat{Y}$$

$$w_{T}=\frac{\gamma}{n}\sum_{i=1}^{T-1}\left(I-\frac{\gamma}{n}\hat{X}^{T}\hat{X}\right)^{i}\hat{X}^{T}\hat{Y}+\frac{\gamma}{n}\hat{X}^{T}\hat{Y}$$

$$w_{T}=\frac{\gamma}{n}\sum_{i=0}^{T-1}\left(I-\frac{\gamma}{n}\hat{X}^{T}\hat{X}\right)^{i}\hat{X}^{T}\hat{Y}$$

Regularizers for sparsity

Assume that a dictionary $\phi_{j}$ with dimension $p$ is given such that a function in the function space can be expressed as:

$$f(x)=\sum_{j=1}^{p}\phi_{j}(x)w_{j}$$

A comparison between the $L_{1}$ ball and the $L_{2}$ ball in two dimensions gives an intuition on how $L_{1}$ regularization achieves sparsity.

Enforcing a sparsity constraint on $w$ can lead to simpler and more interpretable models. This is useful in many real-life applications such as computational biology. An example is developing a simple predictive test for a disease in order to minimize the cost of performing medical tests while maximizing predictive power.

A sensible sparsity constraint is the $L_{0}$ norm $\|w\|_{0}$, defined as the number of non-zero elements in $w$. Solving an $L_{0}$-regularized learning problem, however, has been demonstrated to be NP-hard.[2]

The $L_{1}$ norm can be used to approximate the optimal $L_{0}$ norm via convex relaxation. It can be shown that the $L_{1}$ norm induces sparsity. In the case of least squares, this problem is known as LASSO in statistics and basis pursuit in signal processing.

$$\min_{w\in\mathbb{R}^{p}}\frac{1}{n}\|\hat{X}w-\hat{Y}\|^{2}+\lambda\|w\|_{1}$$

Elastic net regularization

$L_{1}$ regularization can occasionally produce non-unique solutions. A simple example occurs when the space of possible solutions lies on a 45 degree line. This can be problematic for certain applications, and is overcome by combining $L_{1}$ with $L_{2}$ regularization in elastic net regularization, which takes the following form:

$$\min_{w\in\mathbb{R}^{p}}\frac{1}{n}\|\hat{X}w-\hat{Y}\|^{2}+\lambda\left(\alpha\|w\|_{1}+(1-\alpha)\|w\|_{2}^{2}\right),\quad\alpha\in[0,1]$$

Elastic net regularization tends to have a grouping effect, where correlated input features are assigned equal weights.

Elastic net regularization is commonly used in practice and is implemented in many machine learning libraries.
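As one example of such a library, scikit-learn provides an ElasticNet estimator; note that its parameterization (alpha, l1_ratio, and additional scaling factors) only roughly maps onto the $\lambda$ and $\alpha$ of the formula above, so the sketch below is illustrative rather than an exact transcription:

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
w_true = np.zeros(50)
w_true[:4] = [3.0, -2.0, 1.5, 4.0]          # mostly sparse ground truth
y = X @ w_true + 0.1 * rng.normal(size=200)

# l1_ratio plays the role of alpha above; alpha plays the role of lambda (up to scaling).
model = ElasticNet(alpha=0.1, l1_ratio=0.5)
model.fit(X, y)
print(np.count_nonzero(model.coef_), "non-zero coefficients out of", model.coef_.size)
```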

Proximal methods

Main article:  Proximal gradient method

While the $L_{1}$ norm does not result in an NP-hard problem, the $L_{1}$ norm is convex but not differentiable, due to the kink at zero. Subgradient methods, which rely on the subderivative, can be used to solve $L_{1}$-regularized learning problems. However, faster convergence can be achieved through proximal methods.

For a problem $\min_{w\in H}F(w)+R(w)$ such that $F$ is convex, continuous, and differentiable with Lipschitz continuous gradient (such as the least squares loss function), and $R$ is convex, continuous, and proper, the proximal method to solve the problem is as follows. First define the proximal operator

$$\operatorname{prox}_{R}(v)=\operatorname*{argmin}_{w\in\mathbb{R}^{D}}\left\{R(w)+\frac{1}{2}\|w-v\|^{2}\right\},$$

and then iterate

$$w_{k+1}=\operatorname{prox}_{\gamma,R}\left(w_{k}-\gamma\nabla F(w_{k})\right)$$

The proximal method iteratively performs gradient descent and then projects the result back into the space permitted by $R$.

When $R$ is the $L_{1}$ regularizer, the proximal operator is equivalent to the soft-thresholding operator,

$$S_{\lambda}(v)_{i}={\begin{cases}v_{i}-\lambda,&{\text{if }}v_{i}>\lambda\\0,&{\text{if }}v_{i}\in[-\lambda,\lambda]\\v_{i}+\lambda,&{\text{if }}v_{i}<-\lambda\end{cases}}$$

This allows for efficient computation.
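A minimal sketch of the resulting proximal gradient (ISTA-style) iteration for the lasso objective above, with illustrative names; the step size is assumed to be at most $n/(2\sigma_{\max}(\hat{X}^{T}\hat{X}))$ so that the gradient step is stable:

```python
import numpy as np

def soft_threshold(v, lam):
    """Soft-thresholding S_lambda(v): the proximal operator of lam * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def proximal_gradient_lasso(X, Y, lam, step, iters=500):
    """min_w (1/n)||Xw - Y||^2 + lam * ||w||_1 via proximal gradient descent."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        grad = (2.0 / n) * X.T @ (X @ w - Y)             # gradient of the smooth part F(w)
        w = soft_threshold(w - step * grad, step * lam)  # prox of (step * lam) * ||.||_1
    return w
```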

Group sparsity without overlaps

Groups of features can be regularized by a sparsity constraint, which can be useful for expressing certain prior knowledge in an optimization problem.

In the case of a linear model with non-overlapping known groups, a regularizer can be defined:

$$R(w)=\sum_{g=1}^{G}\|w_{g}\|_{g},\quad\text{where}\quad\|w_{g}\|_{g}=\sqrt{\sum_{j=1}^{|G_{g}|}(w_{g}^{j})^{2}}$$

This can be viewed as an $L_{2}$ norm over the members of each group followed by an $L_{1}$ norm over the groups.

This can be solved by the proximal method, where the proximal operator is a block-wise soft-thresholding function:

$$\left(\operatorname{prox}_{\lambda,R,g}(w_{g})\right)^{j}={\begin{cases}w_{g}^{j}-\lambda{\frac{w_{g}^{j}}{\|w_{g}\|_{g}}},&{\text{if }}\|w_{g}\|_{g}>\lambda\\0,&{\text{if }}\|w_{g}\|_{g}\in[-\lambda,\lambda]\\w_{g}^{j}+\lambda{\frac{w_{g}^{j}}{\|w_{g}\|_{g}}},&{\text{if }}\|w_{g}\|_{g}<-\lambda\end{cases}}$$
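A minimal sketch of this block-wise soft-thresholding, assuming the groups are given as index arrays (an illustrative format); since the group norm is nonnegative, only the first two cases of the operator above are reached in practice:

```python
import numpy as np

def group_soft_threshold(w, groups, lam):
    """Proximal operator of lam * sum_g ||w_g||_2 for non-overlapping groups."""
    out = np.zeros_like(w)
    for g in groups:
        norm_g = np.linalg.norm(w[g])
        if norm_g > lam:
            # shrink the whole group toward zero by lam, keeping its direction
            out[g] = w[g] * (1.0 - lam / norm_g)
        # otherwise the entire group is set exactly to zero
    return out

# Example: six coefficients split into two groups of three
w = np.array([0.5, -0.2, 0.1, 3.0, -1.0, 2.0])
groups = [np.array([0, 1, 2]), np.array([3, 4, 5])]
print(group_soft_threshold(w, groups, lam=1.0))   # the first group is zeroed out
```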

Group sparsity with overlaps

The algorithm described for group sparsity without overlaps can be applied to the case where groups do overlap, in certain situations. Note that this will likely result in some groups with all zero elements, and other groups with some non-zero and some zero elements.

If it is desired to preserve the group structure, a new regularizer can be defined:

$$R(w)=\inf\left\{\sum_{g=1}^{G}\|w_{g}\|_{g}:w=\sum_{g=1}^{G}\bar{w}_{g}\right\}$$

For each $w_{g}$, $\bar{w}_{g}$ is defined as the vector such that the restriction of $\bar{w}_{g}$ to the group $g$ equals $w_{g}$ and all other entries of $\bar{w}_{g}$ are zero. The regularizer finds the optimal decomposition of $w$ into parts. It can be viewed as duplicating all elements that exist in multiple groups. Learning problems with this regularizer can also be solved with the proximal method, with a complication: the proximal operator cannot be computed in closed form, but it can be solved iteratively, inducing an inner iteration within the proximal method iteration.

Regularizers for semi-supervised learning

Main article:  Semi-supervised learning

When labels are more expensive to gather than input examples, semi-supervised learning can be useful. Regularizers have been designed to guide learning algorithms to learn models that respect the structure of unsupervised training samples. If a symmetric weight matrix $W$ is given, a regularizer can be defined:

$$R(f)=\sum_{i,j}w_{ij}\left(f(x_{i})-f(x_{j})\right)^{2}$$

If $W_{ij}$ encodes the result of some distance metric for points $x_{i}$ and $x_{j}$, it is desirable that $f(x_{i})\approx f(x_{j})$. This regularizer captures this intuition, and is equivalent to:

$$R(f)=\bar{f}^{T}L\bar{f},\quad\text{where }L=D-W\text{ is the Laplacian matrix of the graph induced by }W.$$

The optimization problem $\min_{f\in\mathbb{R}^{m}}R(f)$, $m=u+l$, can be solved analytically if the constraint $f(x_{i})=y_{i}$ is applied for all supervised samples. The labeled part of the vector $f$ is therefore fixed by the labels. The unlabeled part of $f$ is solved for by:

$$\min_{f_{u}\in\mathbb{R}^{u}}f^{T}Lf=\min_{f_{u}\in\mathbb{R}^{u}}\left\{f_{u}^{T}L_{uu}f_{u}+f_{l}^{T}L_{lu}f_{u}+f_{u}^{T}L_{ul}f_{l}\right\}$$

$$\nabla_{f_{u}}=2L_{uu}f_{u}+2L_{ul}Y$$

$$f_{u}=-L_{uu}^{\dagger}(L_{ul}Y)$$

Note that the pseudo-inverse can be taken because $L_{ul}$ has the same range as $L_{uu}$.
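A minimal sketch of this solve, assuming a small graph given directly by its similarity matrix (names are illustrative):

```python
import numpy as np

def laplacian_label_propagation(W, y_labeled, labeled_idx):
    """Minimize f^T L f subject to f_i = y_i on the labeled points."""
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W                     # graph Laplacian L = D - W
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)
    L_uu = L[np.ix_(unlabeled_idx, unlabeled_idx)]
    L_ul = L[np.ix_(unlabeled_idx, labeled_idx)]
    f = np.zeros(n)
    f[labeled_idx] = y_labeled
    # Setting the gradient 2 L_uu f_u + 2 L_ul y to zero gives f_u = -L_uu^+ (L_ul y)
    f[unlabeled_idx] = -np.linalg.pinv(L_uu) @ (L_ul @ y_labeled)
    return f
```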

Regularizers for multitask learning

Main article:  Multi-task learning

In the case of multitask learning, $T$ problems are considered simultaneously, each related in some way. The goal is to learn $T$ functions, ideally borrowing strength from the relatedness of tasks, that have predictive power. This is equivalent to learning the matrix $W: T\times D$.

Sparse regularizer on columns

$$R(W)=\|W\|_{2,1}=\sum_{i=1}^{D}\|w_{i}\|_{2},\quad\text{where }w_{i}\text{ denotes the }i\text{-th column of }W.$$

This regularizer defines an $L_{2}$ norm on each column and an $L_{1}$ norm over all columns. It can be solved by proximal methods.

Nuclear norm regularization

$$R(W)=\|\sigma(W)\|_{1},\quad\text{where }\sigma(W)\text{ is the vector of singular values of }W.$$
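As a small sketch, this penalty is straightforward to evaluate from the SVD (illustrative code, independent of any particular solver):

```python
import numpy as np

def nuclear_norm(W):
    """Nuclear norm ||sigma(W)||_1: the sum of the singular values of W."""
    return np.linalg.svd(W, compute_uv=False).sum()

W = np.outer([1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0])  # a rank-1 matrix
print(nuclear_norm(W))   # equals its single non-zero singular value
```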

Mean-constrained regularization

$$R(f_{1}\cdots f_{T})=\sum_{t=1}^{T}\left\|f_{t}-\frac{1}{T}\sum_{s=1}^{T}f_{s}\right\|_{H_{k}}^{2}$$

This regularizer constrains the functions learned for each task to be similar to the overall average of the functions across all tasks. This is useful for expressing prior information that each task is expected to share similarities with each other task. An example is predicting blood iron levels measured at different times of the day, where each task represents a different person.

Clustered mean-constrained regularization

$$R(f_{1}\cdots f_{T})=\sum_{r=1}^{C}\sum_{t\in I(r)}\left\|f_{t}-\frac{1}{|I(r)|}\sum_{s\in I(r)}f_{s}\right\|_{H_{k}}^{2},\quad\text{where }I(r)\text{ is a cluster of tasks.}$$

This regularizer is similar to the mean-constrained regularizer, but instead enforces similarity between tasks within the same cluster. This can capture more complex prior information. This technique has been used to predict Netflix recommendations. A cluster would correspond to a group of people who share similar preferences in movies.

Graph-based similarity

More general than above, similarity between tasks can be defined by a function. The regularizer encourages the model to learn similar functions for similar tasks.

$$R(f_{1}\cdots f_{T})=\sum_{t,s=1,\,t\neq s}^{T}\|f_{t}-f_{s}\|^{2}M_{ts}\quad\text{for a given symmetric similarity matrix }M.$$

Other uses of regularization in statistics and machine learning

Bayesian learning methods make use of a prior probability that (usually) gives lower probability to more complex models. Well-known model selection techniques include the Akaike information criterion (AIC), minimum description length (MDL), and the Bayesian information criterion (BIC). Alternative methods of controlling overfitting not involving regularization include cross-validation.

Examples of applications of different methods of regularization to the linear model are:

| Model | Fit measure | Entropy measure[1][3] |
|---|---|---|
| AIC/BIC | $\lVert Y-X\beta\rVert_{2}$ | $\lVert\beta\rVert_{0}$ |
| Ridge regression[4] | $\lVert Y-X\beta\rVert_{2}$ | $\lVert\beta\rVert_{2}$ |
| Lasso[5] | $\lVert Y-X\beta\rVert_{2}$ | $\lVert\beta\rVert_{1}$ |
| Basis pursuit denoising | $\lVert Y-X\beta\rVert_{2}$ | $\lambda\lVert\beta\rVert_{1}$ |
| Rudin–Osher–Fatemi model (TV) | $\lVert Y-X\beta\rVert_{2}$ | $\lambda\lVert\nabla\beta\rVert_{1}$ |
| Potts model | $\lVert Y-X\beta\rVert_{2}$ | $\lambda\lVert\nabla\beta\rVert_{0}$ |
| RLAD[6] | $\lVert Y-X\beta\rVert_{1}$ | $\lVert\beta\rVert_{1}$ |
| Dantzig Selector[7] | $\lVert X^{\top}(Y-X\beta)\rVert_{\infty}$ | $\lVert\beta\rVert_{1}$ |
| SLOPE[8] | $\lVert Y-X\beta\rVert_{2}$ | $\sum_{i=1}^{p}\lambda_{i}\lvert\beta\rvert_{(i)}$ |

Notes

  1. Bishop, Christopher M. (2007). Pattern Recognition and Machine Learning (corr. printing ed.). New York: Springer. ISBN 978-0387310732.
  2. Natarajan, B. (1995). "Sparse Approximate Solutions to Linear Systems". SIAM Journal on Computing 24 (2): 227–234. doi:10.1137/S0097539792240406. ISSN 0097-5397.
  3. Duda, Richard O. (2004). Pattern Classification + Computer Manual: Hardcover Set (2nd ed.). New York: Wiley. ISBN 978-0471703501.
  4. Hoerl, Arthur E.; Kennard, Robert W. (1970). "Ridge regression: Biased estimation for nonorthogonal problems". Technometrics 12 (1): 55–67. doi:10.2307/1267351.
  5. Tibshirani, Robert (1996). "Regression Shrinkage and Selection via the Lasso". Journal of the Royal Statistical Society, Series B 58 (1): 267–288. MR 1379242.
  6. Wang, Li; Gordon, Michael D.; Zhu, Ji (2006). "Regularized Least Absolute Deviations Regression and an Efficient Algorithm for Parameter Tuning". Sixth International Conference on Data Mining. pp. 690–700. doi:10.1109/ICDM.2006.134.
  7. Candès, Emmanuel; Tao, Terence (2007). "The Dantzig selector: Statistical estimation when p is much larger than n". Annals of Statistics 35 (6): 2313–2351. arXiv:math/0506081. doi:10.1214/009053606000001523. MR 2382644.
  8. Bogdan, Małgorzata; van den Berg, Ewout; Su, Weijie; Candès, Emmanuel J. (2013). "Statistical estimation and testing via the ordered L1 norm". arXiv:1310.1969.
