[Nanocubes-discuss] Could I solve this problem using nanocubes?


[Nanocubes-discuss] Could I solve this problem using nanocubes?

李森栋
Excuse me,
I need to build a system that analyses Twitter's hot topics in real time.
I receive about 500 tweets per minute, that is, about 700,000 tweets every day.
I need to build a reverse index for every word, so that users can search for a word and get that word's tag cloud along the timeline.
As in the attached screenshot, the line chart shows each emotion's proportion, and each point corresponds to one hour of tweets.
Below the chart is a panel containing the tag cloud for that hour (clicking a point on the line chart shows that hour's tag cloud).
The tag cloud is computed by counting which words appear most frequently across all tweets in that hour.
So could I build this system using nanocubes? How can I compute the tag clouds in real time, and how can I reduce the search latency?
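
For readers following the thread, the per-hour tag cloud described above can be sketched in plain Python. This is only an illustration and does not use nanocubes. The class name `HourlyTagClouds`, its methods, and the reading of "tag cloud" as "words that co-occur with the searched word in the same hour" are all assumptions, not something stated in the post.

```python
from collections import Counter, defaultdict
from datetime import datetime, timezone


def hour_bucket(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


class HourlyTagClouds:
    """Hypothetical in-memory reverse index: word -> hour -> co-occurring word counts."""

    def __init__(self):
        self.index = defaultdict(lambda: defaultdict(Counter))

    def add_tweet(self, ts: datetime, words):
        """Index one tweet; each co-occurring word is counted once per tweet."""
        hour = hour_bucket(ts)
        unique = set(words)
        for w in unique:
            self.index[w][hour].update(unique - {w})

    def tag_cloud(self, word: str, hour: datetime, k: int = 10):
        """Return the k most frequent co-occurring words for `word` in that hour."""
        hours = self.index.get(word)
        if hours is None:
            return []
        counts = hours.get(hour_bucket(hour))
        return counts.most_common(k) if counts else []


clouds = HourlyTagClouds()
t = datetime(2014, 1, 9, 22, 29, tzinfo=timezone.utc)
clouds.add_tweet(t, ["nanocubes", "fast", "maps"])
clouds.add_tweet(t.replace(minute=45), ["nanocubes", "fast"])
print(clouds.tag_cloud("nanocubes", t, k=1))  # [('fast', 2)]
```

Because the counts are updated as each tweet arrives, a lookup is a dictionary access plus a top-k selection. Memory grows with the number of distinct word pairs per hour, so a real system would likely need to cap the vocabulary or expire old hours.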





Re: [Nanocubes-discuss] Could I solve this problem using nanocubes?

Carlos Scheidegger
There's no support for per-word reverse indices, so I don't think a nanocube is the best fit for your needs here.

Best,
-carlos

On Jan 9, 2014, at 10:29 PM, 李森栋 <[hidden email]> wrote:

> Excuse me,
> I need to build a system that analyses Twitter's hot topics in real time.
> I receive about 500 tweets per minute, that is, about 700,000 tweets every day.
> I need to build a reverse index for every word, so that users can search for a word and get that word's tag cloud along the timeline.
> As in the attached screenshot, the line chart shows each emotion's proportion, and each point corresponds to one hour of tweets.
> Below the chart is a panel containing the tag cloud for that hour (clicking a point on the line chart shows that hour's tag cloud).
> The tag cloud is computed by counting which words appear most frequently across all tweets in that hour.
> So could I build this system using nanocubes? How can I compute the tag clouds in real time, and how can I reduce the search latency?
>
> <截图1.png>
>
>
>
> _______________________________________________
> Nanocubes-discuss mailing list
> [hidden email]
> http://mailman.nanocubes.net/mailman/listinfo/nanocubes-discuss_mailman.nanocubes.net




Re: [Nanocubes-discuss] Could I solve this problem using nanocubes?

李森栋
Thank you very much for your reply.
However, I only need to build reverse indices for about 40,000 words, and I can build those indices myself.
I would like to build one nanocube for every word.
Could I run 40,000+ nanocubes simultaneously?
 





At 2014-01-12 06:21:20, "Carlos Scheidegger" <[hidden email]> wrote:
> There's no support for per-word reverse indices, so I don't think a nanocube is the best fit for your needs here.
>
> Best,
> -carlos



Re: [Nanocubes-discuss] Could I solve this problem using nanocubes?

Carlos Scheidegger
Each nanocube is a process in memory that opens a TCP port, so it's unlikely that you could run 40,000 of them straightforwardly.

If you *really* wanted to do it, you could split your word list into chunks of 256 words (which is the cardinality limit of "categorical data" in nanocubes), and then run one nanocube for each 256-word chunk. That would give you 157 nanocubes (40,000 / 256, rounded up), which *should* actually be possible. You might need a lot of memory, and very possibly more than one machine.
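
The chunking arithmetic in the paragraph above can be checked with a short Python sketch. It is not nanocubes code: the helper names `assign_words` and `cubes_needed` and the placeholder vocabulary are made up for illustration, and the only figures taken from the thread are the 256-category limit and the 40,000-word list.

```python
import math

CATEGORY_LIMIT = 256  # categorical cardinality limit quoted in this thread


def cubes_needed(n_words: int, limit: int = CATEGORY_LIMIT) -> int:
    """Number of chunks (one nanocube each) needed to cover n_words."""
    return math.ceil(n_words / limit)


def assign_words(words, limit: int = CATEGORY_LIMIT):
    """Map each word to (cube_index, category_id) by position in the word list."""
    return {w: divmod(i, limit) for i, w in enumerate(words)}


words = [f"word{i}" for i in range(40000)]  # placeholder vocabulary
placement = assign_words(words)
print(cubes_needed(len(words)))  # 157
print(placement["word300"])      # (1, 44)
```

A front end would then look up the (cube, category) pair for a searched word and send the query to the process serving that cube. How that routing is done is outside what the thread describes.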

-carlos

On Jan 14, 2014, at 7:22 AM, 李森栋 <[hidden email]> wrote:

> Thank you very much for your reply.
> However, I only need to build reverse indices for about 40,000 words, and I can build those indices myself.
> I would like to build one nanocube for every word.
> Could I run 40,000+ nanocubes simultaneously?
 







