Using Theano - 猿圈 (a knowledge community for programmers)

I. Theano's built-in data types

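Only shared variables, created with theano.shared(), have a get_value() method (it returns a numpy.ndarray); purely symbolic variables such as T.matrix('x') do not.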

1. Common usage

x = T.matrix('x')  # the data is presented as rasterized images
y = T.ivector('y') # the labels are presented as 1D vector of [int] labels

# reshape matrix of rasterized images of shape
# (batch_size, 28*28) to a 4D tensor, compatible with LeNetConvPoolLayer
layer0_input = x.reshape((batch_size, 1, 28, 28))

>>> x.reshape((500, 3, 28, 28))
TensorType(float64, 4D)

>>> x.type
TensorType(float64, matrix)
>>> layer0_input.type
TensorType(float64, (False, True, False, False))
            # each boolean says whether the corresponding dimension is broadcastable
>>> x.reshape((500, 3, 28, 28)).type
TensorType(float64, 4D)
>>> T.dtensor4().type
TensorType(float64, 4D)

2. Converting a theano.shared variable to a numpy.ndarray

# train_set_x: a shared variable created with theano.shared()
train_set_x.get_value(borrow=True)
        # returns an ndarray; borrow=True means a reference to the internal value is returned rather than a copy
train_set_x.get_value(borrow=True).shape[0]
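
The difference between borrow=True and the default copy can be mimicked with plain NumPy. The following is only a sketch: SharedSketch is a hypothetical stand-in, not Theano's class, and it assumes the CPU behaviour in which the borrowed value aliases the internal buffer.

```python
import numpy as np


class SharedSketch:
    """Hypothetical stand-in for a CPU shared variable (not Theano's class)."""

    def __init__(self, value):
        self._data = np.array(value)  # private copy of the initial value

    def get_value(self, borrow=False):
        # borrow=True hands out the internal buffer; otherwise a fresh copy
        return self._data if borrow else self._data.copy()


train_set_x = SharedSketch(np.zeros((500, 784)))
borrowed = train_set_x.get_value(borrow=True)
copied = train_set_x.get_value()
n_examples = borrowed.shape[0]  # number of rows, i.e. training examples
print(n_examples)
```

Writing into borrowed would change the stored data, while writing into copied would not, which is why borrowed values are best treated as read-only.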

3. built-in data types

According to Theano's thorough documentation:

Theano's built-in data types live mainly in the theano.tensor submodule:

import theano.tensor as T

Prefix b: byte, i.e. int8 (bscalar, bvector, bmatrix, brow, bcol, btensor3, btensor4)
Prefix w: 16-bit integers (wchar) (wscalar, wvector, wmatrix, wrow, wcol, wtensor3, wtensor4)
Prefix i: 32-bit integers (int) (iscalar, ivector, imatrix, irow, icol, itensor3, itensor4)
Prefix l: 64-bit integers (long) (lscalar, lvector, lmatrix, lrow, lcol, ltensor3, ltensor4)
Prefix f: float, i.e. float32 (fscalar, fvector, fmatrix, fcol, frow, ftensor3, ftensor4)
Prefix d: double, i.e. float64 (dscalar, dvector, dmatrix, dcol, drow, dtensor3, dtensor4)
Prefix c: complex, i.e. complex64 (cscalar, cvector, cmatrix, ccol, crow, ctensor3, ctensor4)
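
As a quick sanity check of the prefix list above, the element sizes can be looked up with NumPy. This is a small sketch that assumes the usual mapping of each prefix to a NumPy dtype name; the dictionary PREFIX_TO_DTYPE is introduced here for illustration and is not part of Theano.

```python
import numpy as np

# Prefix letter -> NumPy dtype name, following the list above
PREFIX_TO_DTYPE = {
    "b": "int8",
    "w": "int16",
    "i": "int32",
    "l": "int64",
    "f": "float32",
    "d": "float64",
    "c": "complex64",
}

# Size in bytes of a single element for each prefix
itemsizes = {p: np.dtype(name).itemsize for p, name in PREFIX_TO_DTYPE.items()}
print(itemsizes)
```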

The tensor3 and tensor4 types are nothing mysterious:

scalar: 0-dim ndarray
vector: 1-dim ndarray
matrix: 2-dim ndarray
tensor3: 3-dim ndarray
tensor4: 4-dim ndarray
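
The correspondence above can be illustrated with NumPy arrays of the matching rank. The array shapes below are arbitrary values chosen for illustration.

```python
import numpy as np

# Constructor name -> an ndarray with the matching number of dimensions
examples = {
    "scalar": np.array(3.0),
    "vector": np.zeros(5),
    "matrix": np.zeros((2, 3)),
    "tensor3": np.zeros((2, 3, 4)),
    "tensor4": np.zeros((2, 3, 4, 5)),
}
ndims = {name: value.ndim for name, value in examples.items()}
print(ndims)
```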

Note that the Python type of every variable created by these constructors is theano.tensor.var.TensorVariable:

>>> x = T.iscalar('x')
>>> type(x)
theano.tensor.var.TensorVariable
>>> x.type
TensorType(int32, scalar)

Let us look further at tensors:

>>> x = T.dmatrix()
>>> x.type
TensorType(float64, matrix)

>>> x = T.matrix()
>>> x.type
TensorType(float64, matrix)     # with the default floatX = float64

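When designing a classic convolutional neural network (CNN), a convolution is applied between the input layer and the first hidden layer. The corresponding API is theano.tensor.nnet.conv2d (called as conv.conv2d in the code below), which takes two main symbolic inputs: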

A 4D tensor corresponding to a mini-batch of input images, with shape:

mini_batch_size
# of feature input maps
image height
image width

A 4D tensor corresponding to the weight matrix $W$, with shape:

# of feature output maps (i.e. # of filters)
# of feature input maps
filter height
filter width
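
The two shape conventions above determine the output shape. The sketch below is a naive NumPy version, not Theano's implementation, and it assumes the 'valid' border mode (no padding); conv2d_valid_sketch is a name made up for this illustration.

```python
import numpy as np


def conv2d_valid_sketch(images, filters):
    """Naive 'valid' 2D convolution over 4D inputs (illustration only)."""
    batch, in_maps, height, width = images.shape
    out_maps, in_maps_w, f_h, f_w = filters.shape
    assert in_maps == in_maps_w, "input feature maps must match"
    flipped = filters[:, :, ::-1, ::-1]  # a true convolution flips the filter
    out = np.zeros((batch, out_maps, height - f_h + 1, width - f_w + 1))
    for i in range(out.shape[2]):
        for j in range(out.shape[3]):
            patch = images[:, :, i:i + f_h, j:j + f_w]
            # sum over the input maps and the filter window
            out[:, :, i, j] = np.tensordot(
                patch, flipped, axes=([1, 2, 3], [1, 2, 3]))
    return out


images = np.random.RandomState(0).rand(4, 3, 12, 12)  # (batch, maps, h, w)
filters = np.random.RandomState(1).rand(2, 3, 9, 9)   # (filters, maps, fh, fw)
out_shape = conv2d_valid_sketch(images, filters).shape
print(out_shape)  # (4, 2, 4, 4)
```

Each output map has height 12 - 9 + 1 = 4 and width 12 - 9 + 1 = 4, and there is one output map per filter.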

import numpy as np
import theano
import theano.tensor as T
from theano.tensor.nnet import conv  # assumption: the legacy conv module, as in older LeNet tutorial code

rng = np.random.RandomState(23455)
input = T.dtensor4('input')
w_shp = (2, 3, 9, 9)
            # 3 input feature maps: the R, G and B colour channels of the image
w_bound = np.sqrt(np.prod(w_shp[1:]))
W = theano.shared(np.asarray(rng.uniform(low=-1./w_bound, high=1./w_bound, size=w_shp), dtype=input.dtype), name='W')
conv_out = conv.conv2d(input, W)

import numpy
import theano.tensor as T
from theano import function
x = T.dscalar('x')
y = T.dscalar('y')
z = x + y
f = function([x, y], z)
numpy.allclose(f(16.3, 12.1), 28.4)               # True
numpy.allclose(z.eval({x: 16.3, y: 12.1}), 28.4)  # True

II. Learning Theano

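tensor: a multi-dimensional array. T provides scalar (a single value), vector, matrix, tensor3 (a 3-dimensional array) and tensor4 (a 4-dimensional array); all of these count as tensors.

dscalar: not a class but a Theano type object (a TensorType instance). The variables created by calling T.dscalar('x') are TensorVariable instances. Specifically, T.dscalar denotes 0-dimensional arrays (scalar) of doubles (d).

pp: a function. After from theano import pp, print(pp(z)) pretty-prints the computation behind z; for z = x + y the output is (x + y).

Concrete examples (Theano 0.8.2):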

import theano
a = theano.tensor.vector()   # declare a symbolic vector from the tensor module
out = a + a**10
f = theano.function([a], out)
print(f([0,1,2]))            # prints [0. 2. 1026.]

Logistic function code:

import theano
import theano.tensor as T
x = T.dmatrix('x')
s = 1/(1 + T.exp(-x))
logistic = theano.function([x], s)
logistic([[0, 1],[-1, -2]])       # returns array([[0.5       , 0.73105858],
                                  #                [0.26894142, 0.11920292]])
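
The values in the comment above can be reproduced without Theano. The sketch below uses only the standard library; it also evaluates the equivalent tanh form s(x) = (1 + tanh(x / 2)) / 2, which is a standard identity rather than something taken from the code above.

```python
import math


def logistic(x):
    """Elementwise logistic function for a nested list (matrix)."""
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in x]


def logistic_tanh(x):
    """Same function written with tanh: s(x) = (1 + tanh(x / 2)) / 2."""
    return [[(1.0 + math.tanh(v / 2.0)) / 2.0 for v in row] for row in x]


values = logistic([[0, 1], [-1, -2]])
values_tanh = logistic_tanh([[0, 1], [-1, -2]])
print(values)
```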

Computing several things at once:

>>> a, b = T.dmatrices('a', 'b')             # dmatrices returns several variables at once: a shortcut for declaring multiple variables
>>> diff = a - b
>>> abs_diff = abs(diff)
>>> diff_squared = diff**2
>>> f = theano.function([a, b], [diff, abs_diff, diff_squared])
>>> f([[1, 1], [1, 1]], [[0, 1], [2, 3]])
[array([[ 1.,  0.],
       [-1., -2.]]), array([[ 1.,  0.],
       [ 1.,  2.]]), array([[ 1.,  0.],
       [ 1.,  4.]])]

Setting default values for arguments, using the In class in the inputs of function

>>> from theano import In
>>> from theano import function
>>> x, y = T.dscalars('x', 'y')
>>> z = x + y
>>> f = function([x, In(y, value=1)], z)          # the In class lets you specify the properties of a function argument in more detail
>>> f(33)
array(34.0)
>>> f(33, 2)
array(35.0)


>>> x, y, w = T.dscalars('x', 'y', 'w')
>>> z = (x + y) * w
>>> f = function([x, In(y, value=1), In(w, value=2, name='w_by_name')], z)        # note the name given to w here
>>> f(33)
array(68.0)
>>> f(33, 2)
array(70.0)
>>> f(33, 0, 1)
array(33.0)
>>> f(33, w_by_name=1)
array(34.0)
>>> f(33, w_by_name=1, y=0)
array(33.0)
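
For readers coming from plain Python, the calls above behave much like a function with default and keyword arguments. The sketch below is only an analogy, not how Theano implements In; it reuses the names x, y and w_by_name from the transcript above.

```python
def f(x, y=1, w_by_name=2):
    """Plain-Python analogue of the compiled function above: (x + y) * w."""
    return (x + y) * w_by_name


results = [
    f(33),
    f(33, 2),
    f(33, 0, 1),
    f(33, w_by_name=1),
    f(33, w_by_name=1, y=0),
]
print(results)  # [68, 70, 33, 34, 33]
```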

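Using shared variables

Suppose we want an accumulator: its state starts at 0 and, each time the function is called, the state is incremented by the function's argument. The shared function constructs so-called shared variables, whose value can be shared among several functions. The value is read with .get_value() and modified with .set_value().

The other new element is the updates parameter of function. updates must be given as a list of (shared variable, new expression) pairs; it can also be a dictionary whose keys are shared variables and whose values are new expressions. As the name suggests, each time the function runs, the value of each shared variable is replaced by the result of the corresponding expression.

Code: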

>>> from theano import shared
>>> state = shared(0)
>>> inc = T.iscalar('inc')
>>> accumulator = function([inc], state, updates=[(state, state+inc)])

>>> print(state.get_value())
0
>>> accumulator(1)
array(0)
>>> print(state.get_value())
1
>>> accumulator(300)
array(1)
>>> print(state.get_value())
301

>>> state.set_value(-1)
>>> accumulator(3)
array(-1)
>>> print(state.get_value())
2                                            # the shared variable is now 2; keep this in mind for what follows

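>>> decrementor = function([inc], state, updates=[(state, state-inc)])          # a second function that shares the same shared variable
>>> decrementor(2)                                                              # inc is given the value 2
array(2)                                                                        # the output is the value before the update, i.e. still 2 (see above)
>>> print(state.get_value())                                                    # the update has set state to 0
0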
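
The order of events in the transcripts above (the output is computed from the old value, then the update is applied) can be imitated in plain Python. This is a sketch of the semantics only: SharedSketch and make_function are hypothetical helpers, not Theano APIs.

```python
class SharedSketch:
    """Hypothetical stand-in for a shared variable (not Theano's class)."""

    def __init__(self, value):
        self._value = value

    def get_value(self):
        return self._value

    def set_value(self, value):
        self._value = value


def make_function(state, update):
    """Return a callable that outputs the old state, then applies the update."""
    def call(inc):
        old = state.get_value()            # the output uses the old value
        state.set_value(update(old, inc))  # the update is applied afterwards
        return old
    return call


state = SharedSketch(0)
accumulator = make_function(state, lambda s, inc: s + inc)
decrementor = make_function(state, lambda s, inc: s - inc)

outputs = [accumulator(1), accumulator(300)]
state.set_value(-1)
outputs.append(accumulator(3))
outputs.append(decrementor(2))
print(outputs, state.get_value())  # [0, 1, -1, 2] 0
```

Both callables close over the same state object, which mirrors how accumulator and decrementor share one shared variable.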

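Using the givens parameter of function

The givens parameter can replace any symbolic variable, not just a shared variable: constants and expressions can be replaced as well. Be careful not to let the replacement expressions depend on one another: the order of substitution is undefined, so they may be applied in any order. In practice, think of givens as a mechanism that lets you replace any part of a formula with a different expression that evaluates to a tensor of the same shape and dtype.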

>>> fn_of_state = state * 2 + inc
>>> # The type of foo must match the shared variable we are replacing
>>> # with the ``givens``
>>> foo = T.scalar(dtype=state.dtype)                                                 # foo will replace state below, so it must have the same dtype
>>> skip_shared = function([inc, foo], fn_of_state, givens=[(state, foo)])            # foo is substituted for state here
>>> skip_shared(1, 3) # we're using 3 for the state, not state.value                  # 1 is bound to inc and 3 to foo; foo stands in for state in the computation
array(7)                                                                              # state * 2 + inc becomes foo * 2 + inc = 3 * 2 + 1 = 7
>>> print(state.get_value()) # old state still there, but we didn't use it            # state is unchanged, so it is still 0
0

The copy function

>>> import theano
>>> import theano.tensor as T
>>> state = theano.shared(0)
>>> inc = T.iscalar('inc')
>>> accumulator = theano.function([inc], state, updates=[(state, state+inc)],on_unused_input='ignore')
>>> accumulator(10)
array(0)
>>> print(state.get_value())
10


>>> new_state = theano.shared(0)
>>> new_accumulator = accumulator.copy(swap={state:new_state})               # the swap parameter replaces state in the original accumulator with new_state
>>> new_accumulator(100)
[array(0)]
>>> print(new_state.get_value())
100


>>> print(state.get_value())                                                 # state in the original function is unchanged
10


>>> null_accumulator = accumulator.copy(delete_updates=True)                 # another copy of accumulator, this time with the updates removed
>>> null_accumulator(9000)
[array(10)]
>>> print(state.get_value())                                                 # the copy has no updates, so it no longer uses the argument inc
10                                                                           # with the updates in place the value would be 9010; without them state keeps its value

Random Numbers

from theano.tensor.shared_randomstreams import RandomStreams
from theano import function
srng = RandomStreams(seed=234)
rv_u = srng.uniform((2,2))                        # a 2x2 random matrix drawn from a uniform distribution
rv_n = srng.normal((2,2))                         # a 2x2 random matrix drawn from a normal distribution
f = function([], rv_u)
g = function([], rv_n, no_default_updates=True) #Not updating rv_n.rng   # the random state of rv_n is not updated, so every call returns the same values
nearly_zeros = function([], rv_u + rv_u - 2 * rv_u)  # remark: a random variable is drawn at most once per function execution, so although rv_u appears three times the result is (nearly) zero

>>> f_val0 = f()
>>> f_val1 = f() #different numbers from f_val0      # two calls, two different results

>>> g_val0 = g() # different numbers from f_val0 and f_val1
>>> g_val1 = g() # same numbers as g_val0!           # two calls, identical results
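
The behaviour of f, g and nearly_zeros can be imitated with NumPy generators. This is an analogy under stated assumptions, not Theano's mechanism: each random variable gets its own np.random.RandomState, and no_default_updates=True is mimicked by restoring the generator state after each draw.

```python
import numpy as np

rng_u = np.random.RandomState(234)  # generator behind the uniform variable
rng_n = np.random.RandomState(235)  # generator behind the normal variable


def f():
    # the generator state advances on every call (default updates)
    return rng_u.uniform(size=(2, 2))


def g():
    # mimic no_default_updates=True: restore the state after drawing
    saved = rng_n.get_state()
    sample = rng_n.normal(size=(2, 2))
    rng_n.set_state(saved)
    return sample


def nearly_zeros():
    # a single draw per call, reused three times in the expression
    u = rng_u.uniform(size=(2, 2))
    return u + u - 2 * u


f_val0, f_val1 = f(), f()
g_val0, g_val1 = g(), g()
zeros = nearly_zeros()
```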

Addendum: random sampling (numpy.random)

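rand(d0, d1, ..., dn): >>> np.random.rand(a, b) returns an a-by-b matrix of random values drawn uniformly from [0, 1).

randn(d0, d1, ..., dn): >>> np.random.randn() returns a single sample from the standard normal distribution.

randint(low[, high, size]): >>> np.random.randint(2, size=10) returns an integer array of length 10 with values in the half-open interval [0, 2).

>>> np.random.randint(size=10, low=0, high=3) returns an integer array of length 10; the smallest possible value is 0 and 3 itself is excluded.

random_integers(low[, high, size]): >>> np.random.random_integers(5, size=(3, 2)) is used like randint, but samples from a closed interval (here [1, 5]); it is deprecated in newer NumPy releases.

random_sample([size]), random([size]), ranf([size]), sample([size]): return random floats in the half-open interval [0.0, 1.0).

choice(a[, size, replace, p]): >>> np.random.choice(5, 3) generates a uniform random sample of size 3 from np.arange(5), so the largest possible value is 4.

>>> np.random.choice(5, 3, p=[0.1, 0, 0.3, 0.6, 0]) generates a non-uniform random sample of size 3 from np.arange(5).

>>> np.random.choice(5, 3, replace=False) generates a uniform random sample of size 3 from np.arange(5) without replacement, e.g. array([3, 1, 0]).

>>> np.random.choice(5, 3, replace=False, p=[0.1, 0, 0.3, 0.6, 0]) generates a non-uniform random sample of size 3 from np.arange(5) without replacement, e.g. array([2, 3, 0]).

bytes(length): returns random bytes, e.g. >>> np.random.bytes(10) gives ' eh\x85\x022SZ\xbf\xa4' (random).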

Permutations:

shuffle(x): modifies the sequence in place, changing its own contents (like shuffling a deck of cards).

>>> arr = np.arange(10)
>>> np.random.shuffle(arr)
>>> arr
[1 7 5 2 9 4 3 6 0 8]

This function only shuffles the array along the first index of a multi-dimensional array:

>>> arr = np.arange(9).reshape((3, 3))
>>> np.random.shuffle(arr)
>>> arr
array([[3, 4, 5],
       [6, 7, 8],
       [0, 1, 2]])

permutation(x): returns a randomly permuted copy; x itself is left unchanged.

>>> np.random.permutation(10)
array([1, 7, 4, 3, 0, 9, 2, 5, 8, 6])
>>> np.random.permutation([1, 4, 9, 12, 15])
array([15,  1,  9,  4, 12])
>>> arr = np.arange(9).reshape((3, 3))
>>> np.random.permutation(arr)
array([[6, 7, 8],
       [0, 1, 2],
       [3, 4, 5]])
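
The practical difference between shuffle and permutation is in-place modification versus returning a copy. A short sketch with a seeded generator makes this visible; the seed 0 is an arbitrary choice for repeatability.

```python
import numpy as np

rng = np.random.RandomState(0)  # seeded generator, so runs are repeatable

arr = np.arange(10)
perm = rng.permutation(arr)     # returns a shuffled copy
after_permutation = arr.copy()  # snapshot: arr has not been touched yet

rng.shuffle(arr)                # reorders arr itself and returns None

rows = np.arange(9).reshape((3, 3))
rng.shuffle(rows)               # only the order of the rows changes
```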

With the above, the classic logistic regression example from the Theano 0.8.2 tutorial should be easy to follow:

import numpy
import theano
import theano.tensor as T
rng = numpy.random
N = 400     # training sample size
feats = 784 # number of input variables
# generate a dataset: D = (input_values, target_class)
D = (rng.randn(N, feats), rng.randint(size=N, low=0, high=2))
training_steps = 10000
# Declare Theano symbolic variables
x = T.dmatrix("x")
y = T.dvector("y")
# initialize the weight vector w randomly
# this and the following bias variable b
# are shared so they keep their values
# between training iterations (updates)
w = theano.shared(rng.randn(feats), name="w")
# initialize the bias term
b = theano.shared(0., name="b")
print("Initial model:")
print(w.get_value())
print(b.get_value())
# Construct Theano expression graph
p_1 = 1 / (1 + T.exp(-T.dot(x, w) - b)) # Probability that target = 1
prediction = p_1 > 0.5 # The prediction thresholded
xent = -y * T.log(p_1) - (1-y) * T.log(1-p_1) # Cross-entropy loss function
cost = xent.mean() + 0.01 * (w ** 2).sum()# The cost to minimize
gw, gb = T.grad(cost, [w, b]) # Compute the gradient of the cost
# w.r.t weight vector w and bias term b (we shall return to this in a following section of this tutorial)

# Compile
train = theano.function( inputs=[x,y], outputs=[prediction, xent], updates=((w, w - 0.1 * gw), (b, b - 0.1 * gb)))
predict = theano.function(inputs=[x], outputs=prediction)

# Train
for i in range(training_steps):
    pred, err = train(D[0], D[1])
print("Final model:")
print(w.get_value())
print(b.get_value())
print("target values for D:")
print(D[1])
print("prediction on D:")
print(predict(D[0]))
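
To see what T.grad and the updates compute in the example above, the same cost can be minimised in plain NumPy with hand-derived gradients. This is a sketch under stated assumptions, not the tutorial's code: the data set is much smaller (N = 50, feats = 5), the weights start at zero, and only 100 steps are taken.

```python
import numpy as np

rng = np.random.RandomState(0)
N, feats = 50, 5                        # much smaller than the example above
X = rng.randn(N, feats)
y = rng.randint(0, 2, size=N).astype(float)
w = np.zeros(feats)                     # assumption: start from zeros
b = 0.0


def cost_and_grads(w, b):
    p_1 = 1.0 / (1.0 + np.exp(-X.dot(w) - b))            # P(target = 1)
    xent = -y * np.log(p_1) - (1 - y) * np.log(1 - p_1)  # cross-entropy
    cost = xent.mean() + 0.01 * (w ** 2).sum()
    gw = X.T.dot(p_1 - y) / N + 0.02 * w                  # d cost / d w
    gb = (p_1 - y).mean()                                 # d cost / d b
    return cost, gw, gb


initial_cost, _, _ = cost_and_grads(w, b)
for _ in range(100):
    _, gw, gb = cost_and_grads(w, b)
    w, b = w - 0.1 * gw, b - 0.1 * gb                     # same step size 0.1
final_cost, _, _ = cost_and_grads(w, b)
print(initial_cost, final_cost)
```

With zero weights every predicted probability is 0.5, so the initial cost equals log 2; the cost then decreases as the updates are applied.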

About scan: not so easy to understand

A rough description of the parameters

A typical call to scan looks roughly like this:

results, updates = theano.scan(
    fn=lambda y, p, x_tm2, x_tm1, A: y + p + x_tm2 + x_tm1 + A,
    sequences=[Y, P[::-1]],
    outputs_info=[dict(initial=X, taps=[-2, -1])],
    non_sequences=A)

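fn is the function to compute at each step, usually defined with a lambda. The order of its arguments matters: first the current elements of sequences (y, p), then the previous outputs requested in outputs_info (x_tm2, x_tm1), then the non_sequences (A).
sequences lists the sequences to iterate over; the size of their first (leading) dimension is the number of iterations. Y and P[::-1] should therefore have the same leading size; if they differ, scan truncates to the shortest.
outputs_info describes which earlier outputs each step needs: dict(initial=X, taps=[-2, -1]) means the outputs of the two preceding steps are used. If the current step's output is x(t), the computation uses x(t-2) and x(t-1), with X supplying the initial values.
non_sequences lists inputs that are not iterated over: A is a fixed input, so the same A is added in every iteration. In this example, if Y is a vector, then A is a scalar, i.e. A has one dimension fewer than Y.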
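
The argument order and the taps mechanism described above can be imitated with an ordinary loop. The sketch below is not Theano's scan: scan_sketch is a hypothetical helper hard-wired to taps=[-2, -1], and the values of Y, P, X and A are made up for illustration.

```python
def scan_sketch(fn, sequences, initial, non_sequences):
    """Plain-Python imitation of scan with taps=[-2, -1] (illustration only)."""
    n_steps = min(len(seq) for seq in sequences)  # truncate to the shortest
    history = list(initial)                       # [x(-2), x(-1)]
    outputs = []
    for t in range(n_steps):
        seq_args = [seq[t] for seq in sequences]  # current element of each sequence
        x_tm2, x_tm1 = history[-2], history[-1]   # the two previous outputs
        x_t = fn(*seq_args, x_tm2, x_tm1, *non_sequences)
        outputs.append(x_t)
        history.append(x_t)
    return outputs


Y = [1, 2, 3]
P = [10, 20, 30, 40]
X = [0, 1]  # initial values x(-2) and x(-1)
A = 100

results = scan_sketch(
    fn=lambda y, p, x_tm2, x_tm1, A: y + p + x_tm2 + x_tm1 + A,
    sequences=[Y, P[::-1]],
    initial=X,
    non_sequences=[A],
)
print(results)  # [142, 275, 540]
```

Only three steps run because Y is shorter than P[::-1]; the first step computes 1 + 40 + 0 + 1 + 100 = 142, and each later step feeds on the two outputs before it.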

The official documentation uses scan in two examples, computing the Jacobian and Hessian matrices:

theano.gradient.jacobian():

>>> import theano
>>> import theano.tensor as T
>>> x = T.dvector('x')
>>> y = x ** 2
>>> J, updates = theano.scan(lambda i, y,x : T.grad(y[i], x), sequences=T.arange(y.shape[0]), non_sequences=[y,x])
>>> f = theano.function([x], J, updates=updates)
>>> f([4, 4])
array([[ 8.,  0.],
       [ 0.,  8.]])

theano.gradient.hessian()

>>> x = T.dvector('x')
>>> y = x ** 2
>>> cost = y.sum()
>>> gy = T.grad(cost, x)
>>> H, updates = theano.scan(lambda i, gy,x : T.grad(gy[i], x), sequences=T.arange(gy.shape[0]), non_sequences=[gy, x])
>>> f = theano.function([x], H, updates=updates)
>>> f([4, 4])
array([[ 2.,  0.],
       [ 0.,  2.]])
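
The two results above can be cross-checked numerically without Theano. The sketch below approximates a Jacobian by central differences in plain Python; jacobian_fd is a hypothetical helper, and the step size eps=1e-6 is an arbitrary choice.

```python
def jacobian_fd(func, x, eps=1e-6):
    """Central-difference Jacobian of func at x; rows index the outputs."""
    n_out = len(func(x))
    jac = [[0.0] * len(x) for _ in range(n_out)]
    for j in range(len(x)):
        plus = list(x)
        minus = list(x)
        plus[j] += eps
        minus[j] -= eps
        f_plus, f_minus = func(plus), func(minus)
        for i in range(n_out):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2 * eps)
    return jac


def square(x):
    return [v ** 2 for v in x]   # y = x ** 2


def grad_cost(x):
    return [2 * v for v in x]    # gradient of cost = sum(x ** 2)


J = jacobian_fd(square, [4.0, 4.0])     # approximately [[8, 0], [0, 8]]
H = jacobian_fd(grad_cost, [4.0, 4.0])  # approximately [[2, 0], [0, 2]]
print(J, H)
```

The Hessian is obtained as the Jacobian of the gradient, which is the same idea as the scan over gy in the transcript above.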

Seeding Streams, Sharing Streams Between Functions, Copying Random State Between Theano Graphs
