Doing Kmeans cluster analysis, how to I manually define a certain cluster-center? For example I want to say my cluster centers are [1,2,3] and [3,4,5] and now I want to cluster my vectors to the predefined centers.
something like kmeans.cluster_centers_ = [[1,2,3],[3,4,5]]
?
to work around my problem thats what I do atm:
number_of_clusters = len(vec)
kmeans = KMeans(number_of_clusters, init='k-means++', n_init=100)
kmeans.fit(vec)
it basically defines a cluster for each vector. But it takes ages to compute as I have thousands of vectors/sentences. There must be an option to set the vector coordinates directly as cluster coordinates without the need to compute them with the kmeans algorithm. (as the center outputs are basically the vector coordinates after i run the algorithm...)
Edit to be more specific about my task:
So what I do want is I have tonns of vectors ( generated from sentences) and now I want to cluster these. But imagine I have two columns of sentences and always want to sort a B column sentence to an A column sentence. Not A column sentences to each other. Thats why I want to set cluster centers for the A column vectors and afterwards predict the clostest B vectors to these Centers. Hope that makes sense
I am using sklearn kmeans atm