SVD-python
Is a very clear, parsimonious implementation of the SVD.
It's practically psuedocode and should be fairly easy to understand
and compare/draw on for your php implementation, even if you don't know much python.
SVD-python
Yani diğerleri ben oldukça sınırlı bir web barındırma gibi geliyor ne php uygulama ile çok ağır LSA yapabilmek için beklemek olmaz belirttiğim gibi, dedi.
Şerefe
Edit:
The module above doesn't do anything all by itself, but there is an example included in the
opening comments. Assuming you downloaded the python module, and it was accessible (e.g. in the same folder), you
could implement a trivial example as follow,
#!/usr/bin/python
import svd
import math
a = [[22.,10., 2., 3., 7.],
[14., 7.,10., 0., 8.],
[-1.,13.,-1.,-11., 3.],
[-3.,-2.,13., -2., 4.],
[ 9., 8., 1., -2., 4.],
[ 9., 1.,-7., 5.,-1.],
[ 2.,-6., 6., 5., 1.],
[ 4., 5., 0., -2., 2.]]
u,w,vt = svd.svd(a)
print w
Here 'w' contains your list of singular values.
Of course this only gets you part of the way to latent semantic analysis and its relatives.
You usually want to reduce the number of singular values, then employ some appropriate distance
metric to measure the similarity between your documents, or words, or documents and words, etc.
The cosine of the angle between your resultant vectors is pretty popular.
Latent Semantic Mapping (pdf)
is by far the clearest, most concise and informative paper I've read on the remaining steps you
need to work out following the SVD.
Edit2: also note that if you're working with very large term-document matrices (I'm assuming this
is what you are doing) it is almost certainly going to be far more efficient to perform the decomposition
in an offline mode, and then perform only the comparisons in a live fashion in response to requests.
while svd-python is great for learning, the svdlibc is more what you would want for such heavy
computation.
finally as mentioned in the bellegarda paper above, remember that you don't have to recompute the
svd every single time you get a new document or request. depending on what you are trying to do you could
probably get away with performing the svd once every week or so, in an offline mode, a local machine,
and then uploading the results (size/bandwidth concerns notwithstanding).
Yine de iyi şanslar!