Time |
S |
Nick |
Message |
10:41 |
|
|
dotplus joined #sourcefu |
10:58 |
|
|
dotplus joined #sourcefu |
11:02 |
|
dotplus |
bear: is sleekxmpp still considered to be a/the Right Way to build xmpp clients/bots? |
11:11 |
|
dotplus |
in Python, that is:) |
11:22 |
|
|
copyit joined #sourcefu |
12:18 |
|
|
copyit joined #sourcefu |
12:26 |
|
|
copyit joined #sourcefu |
12:33 |
|
|
copyit joined #sourcefu |
12:43 |
|
|
copyit joined #sourcefu |
14:12 |
|
pdurbin |
Huh, apparently https://github.com/pdurbin.atom is the new https://github.com/pdurbin?tab=activity (which doesn't work anymore). See also http://stackoverflow.com/questions/9128049/view-entire-activities-of-an-user-in-github#comment66602673_9128958 |
14:19 |
|
pdurbin |
The problem is that that Atom link doesn't render in Chrome. You just see the XML. It does render fine in Firefox at least. |
14:20 |
|
pdurbin |
But I don't particularly want to install Firefox on my Android phone. I'm fine with the default browser (Chrome). |
14:21 |
|
pdurbin |
huh, "It shows the XML code, unformatted" 84 - RSS or Atom support needed - chromium - Monorail - https://bugs.chromium.org/p/chromium/issues/detail?id=84 |
14:23 |
|
pdurbin |
https://bugs.chromium.org/p/chromium/issues/detail?id=84#c149 says the bug can be fixed by installing https://chrome.google.com/extensions/detail/nlbjncdgjeocebhnmkbbbdekmmmcbfjd |
14:25 |
|
pdurbin |
But apparently I can't install that extension on Android. |
14:48 |
|
pdurbin |
ooh, `.mode line` in sqlite is nice: https://www.sqlite.org/cli.html |
14:59 |
|
pdurbin |
To back up a bit, this is the question that was just asked: One of the questions is to find the state with the most counties in it from the census data in USA http://www.census.gov/popest/data/counties/totals/2015/files/CO-EST2015-alldata.csv |
14:59 |
|
pdurbin |
over at https://gitter.im/pydata/pandas?at=58318fd0a5bc784f5658023f |
14:59 |
|
pdurbin |
How would people in this channel get the answer? |
14:59 |
|
pdurbin |
aditsu bear codex dotplus prologic semiosis sivoais tumdedum westmaas ^^ |
15:00 |
|
aditsu |
huh? |
15:00 |
|
pdurbin |
Lately I've been thinking I should learn SQL better: http://irclog.greptilian.com/sourcefu/2016-11-14 |
15:01 |
|
pdurbin |
aditsu: how would you find the state with the most counties in it based on that csv file? |
15:01 |
|
aditsu |
let me see the file.. |
15:02 |
|
aditsu |
I got nxdomain for www.census.gov o_O |
15:02 |
|
pdurbin |
aditsu: can you grab it from http://tmp.greptilian.com/tmp/data/CO-EST2015-alldata.csv ? |
15:03 |
|
aditsu |
(but dig says servfail) |
15:03 |
|
aditsu |
yes, that one worked |
15:04 |
|
pdurbin |
cool. lemme know your approach |
15:04 |
|
aditsu |
any language requirement? or algorithm question? |
15:04 |
|
pdurbin |
nope. just get the answer |
15:04 |
|
pdurbin |
which state and how many counties |
15:05 |
|
aditsu |
they're grouped by state, so I can just check sequentially |
15:06 |
|
pdurbin |
but with what tool? I'm using sqlite |
15:06 |
|
aditsu |
brb |
15:08 |
|
aditsu |
so, my first idea is libreoffice calc, but I don't know the functions well enough; 2nd idea is CJam :) |
15:08 |
|
aditsu |
I can probably do it in a couple of minutes |
15:10 |
|
aditsu |
[[1 "STNAME"] [68 "Alabama"] [30 "Alaska"] [16 "Arizona"] [76 "Arkansas"] [59 "California"] [65 "Colorado"] [9 "Connecticut"] [4 "Delaware"] [2 "District of Columbia"] [68 "Florida"] [160 "Georgia"] [6 "Hawaii"] [45 "Idaho"] [103 "Illinois"] [93 "Indiana"] [100 "Iowa"] [106 "Kansas"] [121 "Kentucky"] [65 "Louisiana"] [17 "Maine"] [25 "Maryland"] [15 "Massachusetts"] [84 "Michigan"] [88... |
15:10 |
|
aditsu |
..."Minnesota"] [83 "Mississippi"] [116 "Missouri"] [57 "Montana"] [94 "Nebraska"] [18 "Nevada"] [11 "New Hampshire"] [22 "New Jersey"] [34 "New Mexico"] [63 "New York"] [101 "North Carolina"] [54 "North Dakota"] [89 "Ohio"] [78 "Oklahoma"] [37 "Oregon"] [68 "Pennsylvania"] [6 "Rhode Island"] [47 "South Carolina"] [67 "South Dakota"] [96 "Tennessee"] [255 "Texas"] [30 "Utah"] [15 "Vermont"]... |
15:10 |
|
aditsu |
...[134 "Virginia"] [40 "Washington"] [56 "West Virginia"] [73 "Wisconsin"] [24 "Wyoming"] [1 ""]] |
15:11 |
|
aditsu |
oops, a bit too long :p |
15:11 |
|
pdurbin |
maybe I should start a "katas" area for #sourcefu like I did for #crimsonfu: https://github.com/crimsonfu/code/tree/master/katas |
15:11 |
|
aditsu |
result: [255 "Texas"] |
15:11 |
|
aditsu |
full code: qN/',f/5f=e`{0=}$W=p |
15:12 |
|
pdurbin |
aditsu: cool, then you just need to sort and pick the biggest |
15:12 |
|
aditsu |
yeah, just did |
15:12 |
|
pdurbin |
ah, a bit of lag |
15:12 |
|
aditsu |
I can probably make it a bit shorter |
15:12 |
|
pdurbin |
my sqlite solution: select stname,count(ctyname) from census group by stname order by count(ctyname) desc limit 1; |
15:13 |
|
aditsu |
that works |
15:13 |
|
pdurbin |
aditsu: is that CJam? |
15:13 |
|
aditsu |
yes |
15:13 |
|
pdurbin |
it's a little cryptic :) |
15:14 |
|
aditsu |
yeah.. it's a golfing language so not very readable, but very concise :) |
15:14 |
|
pdurbin |
maybe sivoais can come up with something shorter in Perl :) |
15:14 |
|
pdurbin |
I'm not very into golfing myself. |
15:14 |
|
aditsu |
updated a bit: qN%',f/5f=e`$W=p |
15:15 |
|
aditsu |
I doubt perl can get shorter |
15:15 |
|
pdurbin |
me neither |
15:15 |
|
pdurbin |
aditsu: where's the bit where you read in the file? |
15:15 |
|
aditsu |
"q" reads the whole file |
15:15 |
|
aditsu |
as a string |
15:16 |
|
pdurbin |
ok |
15:16 |
|
aditsu |
the key part is "e`" which does RLE compression |
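For readers who don't know CJam, here is a rough Python sketch of the same run-length idea. It is an illustration, not a translation checked against a CJam interpreter. The `SAMPLE` rows are made up for the example (only the column order follows the census CSV), and the function name is hypothetical. Like the CJam version, it relies on the rows being grouped by state, and it counts every row in a run, whatever kind of row it is.

```python
from itertools import groupby

# Made-up sample rows in the census CSV's column order
# (SUMLEV,REGION,DIVISION,STATE,COUNTY,STNAME,CTYNAME); not real data.
SAMPLE = """SUMLEV,REGION,DIVISION,STATE,COUNTY,STNAME,CTYNAME
040,3,6,01,000,Alabama,Alabama
050,3,6,01,001,Alabama,Autauga County
050,3,6,01,003,Alabama,Baldwin County
040,4,9,02,000,Alaska,Alaska
050,4,9,02,013,Alaska,Aleutians East Borough
"""

def most_rows_by_run_length(text, column=5):
    """Run-length encode one column and return the largest (count, value) pair."""
    # Roughly qN% ',f/ 5f= : split into lines, split on commas, keep field 5.
    values = [line.split(",")[column] for line in text.splitlines() if line]
    # Roughly e` : collapse each run of equal adjacent values into (length, value).
    runs = [(len(list(group)), value) for value, group in groupby(values)]
    # Roughly $W= : sort the pairs and take the last (largest) one.
    return sorted(runs)[-1]

result = most_rows_by_run_length(SAMPLE)
print(result)
```

The header line shows up as its own run of length 1, which matches the `[1 "STNAME"]` entry in the CJam output above.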
15:20 |
|
aditsu |
can sqlite query a csv file directly? |
15:20 |
|
aditsu |
if not, there's http://harelba.github.io/q/ |
15:21 |
|
pdurbin |
aditsu: yeah, you just do `.import CO-EST2015-alldata.csv census` or whatever |
15:21 |
|
aditsu |
ah, there's an import step |
15:21 |
|
pdurbin |
oh, `.mode csv` first. see "CSV Import" at https://www.sqlite.org/cli.html |
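The same import-then-query flow can be sketched from Python with the standard `sqlite3` and `csv` modules, for anyone who would rather not use the sqlite shell. This is a minimal sketch under assumptions: `load_csv` is a hypothetical helper, the `SAMPLE` rows are made up, and every column is created as TEXT with names taken from the header row, which is what the shell's `.import` does when the table doesn't exist yet.

```python
import csv
import io
import sqlite3

# Made-up sample rows in the census CSV's column order; not real data.
SAMPLE = """SUMLEV,REGION,DIVISION,STATE,COUNTY,STNAME,CTYNAME
040,3,6,01,000,Alabama,Alabama
050,3,6,01,001,Alabama,Autauga County
050,3,6,01,003,Alabama,Baldwin County
040,4,9,02,000,Alaska,Alaska
050,4,9,02,013,Alaska,Aleutians East Borough
"""

def load_csv(conn, table, text):
    """Create `table` from CSV text: header row gives column names, all TEXT."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    columns = ", ".join(f'"{name.lower()}" TEXT' for name in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', data)

conn = sqlite3.connect(":memory:")
load_csv(conn, "census", SAMPLE)

# The query from earlier in the log, unchanged.
top = conn.execute(
    "select stname, count(ctyname) from census "
    "group by stname order by count(ctyname) desc limit 1"
).fetchone()
print(top)
```

To run it against the real file, `SAMPLE` could be replaced with the contents of the downloaded CSV; the file's encoding is not stated in the log, so that part is left open.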
15:25 |
|
pdurbin |
aditsu: have you used `q` and if so, do you like it? |
15:25 |
|
aditsu |
I don't remember if I actually tried it :p |
15:26 |
|
pdurbin |
ok, it's a neat idea |
15:26 |
|
aditsu |
I have a q command on this machine, but it's a different thing |
15:26 |
|
pdurbin |
I don't mind the import step. I like that sqlite is everywhere. |
15:26 |
|
pdurbin |
maybe I'll try to figure out how to get the answer in R |
15:27 |
|
aditsu |
I haven't used sqlite much |
15:29 |
|
pdurbin |
aditsu: I'm surprised you haven't mentioned Depeche yet. |
15:30 |
|
aditsu |
haha, I was thinking about it :) I have some code that enables it to use csv files directly, but it needs some more work |
15:31 |
|
aditsu |
also I haven't implemented stuff like group by (with count) |
15:32 |
|
aditsu |
of course, if you load the csv into a database, you can use that, but then you can do it manually in sql without java code |
15:33 |
|
pdurbin |
aditsu: our answer is wrong! |
15:33 |
|
aditsu |
wat? |
15:34 |
|
aditsu |
is it 254? |
15:35 |
|
aditsu |
the answer is correct, the input is wrong :D |
15:36 |
|
pdurbin |
time to fix our code |
15:36 |
|
aditsu |
the code is perfectly fine based on the problem description |
15:37 |
|
pdurbin |
nope |
15:37 |
|
pdurbin |
leave it to my wife to actually look at the data :) |
15:37 |
|
pdurbin |
aditsu: go fix your code. I'll try to fix mine. |
15:38 |
|
aditsu |
yes it is fine, the data is wrong |
15:38 |
|
aditsu |
I also noticed that but I initially thought they have counties with the same name |
15:41 |
|
aditsu |
so just need to subtract 1 to compensate for the wrong input |
15:43 |
|
aditsu |
On two occasions I have been asked, — "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" |
15:43 |
|
aditsu |
I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. |
16:04 |
|
pdurbin |
ok, this fixes it: select stname,count(ctyname) from census where county != '000' group by stname order by count(ctyname) desc limit 1; |
16:04 |
|
aditsu |
you could have just added a "-1" |
16:06 |
|
pdurbin |
where? |
16:06 |
|
aditsu |
count(ctyname)-1 |
16:06 |
|
aditsu |
in the select part |
16:08 |
|
pdurbin |
ah. thanks. yes, this works: select stname,count(ctyname)-1 from census group by stname order by count(ctyname) desc limit 1; |
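A small sketch comparing the two fixes, using Python's `sqlite3` and a few made-up rows (not real census data). It assumes, as the discussion above does, that each state has exactly one extra row with county '000'; the "-1" version only gives the right count under that assumption, while the `where` version drops those rows explicitly.

```python
import sqlite3

# Made-up rows: (sumlev, county, stname, ctyname). The county '000' rows
# stand in for the per-state rows that inflated the original count.
ROWS = [
    ("040", "000", "Alabama", "Alabama"),
    ("050", "001", "Alabama", "Autauga County"),
    ("050", "003", "Alabama", "Baldwin County"),
    ("040", "000", "Alaska", "Alaska"),
    ("050", "013", "Alaska", "Aleutians East Borough"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table census (sumlev text, county text, stname text, ctyname text)"
)
conn.executemany("insert into census values (?, ?, ?, ?)", ROWS)

# Fix 1: filter out the county '000' rows before counting.
filtered = conn.execute(
    "select stname, count(ctyname) from census where county != '000' "
    "group by stname order by count(ctyname) desc limit 1"
).fetchone()

# Fix 2: count everything, then subtract the one extra row per state.
adjusted = conn.execute(
    "select stname, count(ctyname) - 1 from census "
    "group by stname order by count(ctyname) desc limit 1"
).fetchone()

print(filtered, adjusted)
```

On this sample both queries agree. Subtracting 1 from every group doesn't change the ordering, so the winning state is the same either way.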
16:26 |
|
|
aditsu_phone joined #sourcefu |
16:56 |
|
pdurbin |
ah, https://www.census.gov/popest/data/counties/totals/2015/files/CO-EST2015-alldata.pdf is awfully helpful |
16:56 |
|
pdurbin |
via https://www.census.gov/popest/data/counties/totals/2015/CO-EST2015-alldata.html |
16:56 |
|
pdurbin |
(the first hit for the csv file name) |
17:04 |
|
pdurbin |
this is interesting: select stname,division from census where sumlev = '040' order by division; |
17:05 |
|
pdurbin |
Ohio (where I grew up) is considered "East North Central". I thought I was from the Midwest. :) |
17:08 |
|
|
AndChat|264089 joined #sourcefu |
17:13 |
|
pdurbin |
I've forgotten all of my R, sadly. |
17:14 |
|
pdurbin |
I should beef up my notes at http://wiki.greptilian.com/r |
17:21 |
|
pdurbin |
ah, according to https://gitter.im/pydata/pandas?at=5831db4b3418b2e57f2ba695 and http://pandas.pydata.org/pandas-docs/stable/comparison_with_sql.html a solution in pandas is this: df.groupby('STNAME').size().idxmax() |
17:25 |
|
pdurbin |
I bet it doesn't account for SUMLEV though. |
17:55 |
|
pdurbin |
Yeah, the revised pandas answer: df[df['SUMLEV']==50].groupby('STNAME').size().idxmax() |
18:16 |
|
pdurbin |
oh, I am still from the midwest: select stname,region from census where sumlev = '040' and region = 2; |
18:16 |
|
pdurbin |
I didn't notice "region". :) |