IRC log for #rest, 2015-12-17

https://trygvis.io/rest-wiki/

All times shown according to UTC.

Time	Nick	Message
00:10		fuzzyhorns joined #rest
00:20	pdurbin	got into a big argument about URLs (path vs. query parameters) and REST today
00:37	pdurbin	well, a difference of opinion at least
00:37	pdurbin	:)
00:40	pdurbin	I think what he was really making an argument for was pretty URLs: https://en.wikipedia.org/wiki/Semantic_URL
00:42	pdurbin	"Semantic URLs, by contrast, contain only the path of a resource"
01:03		fuzzyhorns joined #rest
01:10	* pdurbin	looks at http://programmers.stackexchange.com/questions/270898/designing-a-rest-api-by-uri-vs-query-string
01:25	pdurbin	so in this example...
01:25	pdurbin	response = client.post('http://api.figshare.com/v1/my_data/articles/92285/action/make_public', auth=oauth)
01:25	pdurbin	... what do you call 92285?
01:27		locks joined #rest
02:11	pdurbin	let's call it an identifier
02:13	impl	are we having a semantistential crisis?
02:30	pdurbin	yes
02:33	pdurbin	the identifier is part of the path which makes total sense when the identifier is an integer such as 92285 (probably a database id) but what if users know their dataset not by some internal database id but by an identifier that may have a varying number of slashes in it such as doi:10.5072/FK2/HSYEMH ... this doesn't work so well in a path
02:35	pdurbin	this doesn't work so well, I mean, because of the slashes (may be 2 slashes or 1 slash) in the identifier: client.post('http://example.com/v1/my_data/articles/doi:10.5072/FK2/HSYEMH/action/make_public')
02:35	pdurbin	right?
02:37	impl	i mean... i guess you could use %2f if you really cared? i'd either use a surrogate or just replace slashes with another character
02:38	impl	http://example.com/v1/my_data/articles/?doi=doi:10.1234/a/b/c -> 301 to /v1/my_data/articles/92285
02:38	pdurbin	well, the users might be grumpy about not being able to use the identifiers as is. they probably don't want to percent encode them or replace the slashes with some other character
02:40	pdurbin	impl: but would http://example.com/v1/my_data/articles/?doi=doi:10.1234/action/make_public make sense? probably not... there are other actions like "publish" or whatever
02:40	impl	no, they'd have to discover the proper root URL first
02:41	impl	i mean if you really want to make this hierarchical you're stuck with using the way that URIs define hierarchy
02:41	pdurbin	right. slashes are special
02:42	pdurbin	the point is that it's annoying when identifiers have slashes in them. you have to do tricks to put them in the path
02:42	impl	yeah
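One way to do the %2f trick mentioned above is to percent-encode the whole identifier so that it occupies exactly one path segment. A minimal Python sketch using only the standard library; the base URL is the example.com one from the discussion, and the helper name article_action_url is made up for illustration:

```python
from urllib.parse import quote, unquote

BASE = "http://example.com/v1/my_data/articles"  # example URL from the chat


def article_action_url(identifier, action):
    # safe="" makes quote() escape "/" (and ":") as well, so the identifier
    # stays inside one path segment however many slashes it contains.
    return f"{BASE}/{quote(identifier, safe='')}/action/{quote(action, safe='')}"


url = article_action_url("doi:10.5072/FK2/HSYEMH", "make_public")
print(url)

# A server that splits the path first can decode the segment back to the id.
segment = url[len(BASE) + 1:].split("/")[0]
print(unquote(segment))
```

Whether the encoded slash survives the trip depends on every proxy and framework in between leaving %2F alone, which is not guaranteed.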
02:43	impl	(as an aside - DOI identifiers are indeed hierarchical, i just checked)
02:44	impl	the syntax is directory.registrant.subcode/uniqueid
02:44	pdurbin	yeah, and the reason why sometimes there's an extra slash is that the "shoulder" is optional
02:45	impl	pdurbin: in any case, i think the general line of thought on that sort of thing is that your client would substitute the ID, performing encoding as necessary
02:46	impl	like, most people probably aren't going to write it like you have it... they're going to do something like client.post('http://example.com/v1/my_data/articles/:id/action/make_public', {id: 'doi/10.5072/FK2/HSYEMH'})
02:46	impl	or even better if you provide a client, you can say something like client.for_article('doi/10.5072/FK2/HSYEMH').make_public()
02:47	pdurbin	impl: is that a literal ":id" string in that path?
02:47	impl	pdurbin: no, it'd be substituted by the client
02:48	pdurbin	but then I'd have to deal with variable numbers of slashes in the id
02:50	impl	oh, there's a whole RFC for this
02:50	impl	http://tools.ietf.org/html/rfc6570
02:51	pdurbin	which part?
02:51	impl	http://tools.ietf.org/html/rfc6570#section-3.2.6
02:52	impl	PHP's guzzle client supports this for instance
02:52	impl	http://guzzle3.readthedocs.org/http-client/uri-templates.html
02:53	pdurbin	so "{/list*} /red/green/blue" means variable number of slashes?
02:53	impl	in your case: $client->get(['http://example.com/v1/my_data/articles{/id}/action/make_public', ['id' => 'doi:10.5072/FK2/HSYEMH']]); would result in a request to http://example.com/v1/my_data/articles/doi%3A10.5072%2FFK2%2FHSYEMH/action/make_public
02:54	impl	yeah, it looks like that
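To make the {/id} and {/list*} forms concrete, here is a rough Python sketch of path-segment expansion. It covers only those two expression shapes, not the full RFC 6570 grammar, and the function name expand_path_segments is an assumption for illustration rather than any library's API. Path-segment expansion percent-encodes everything outside the unreserved set, so the colon in a DOI is escaped along with the slashes:

```python
import re
from urllib.parse import quote


def expand_path_segments(template, variables):
    """Expand {/name} and {/name*} only (a small subset of RFC 6570)."""

    def replace(match):
        name, explode = match.group(1), match.group(2) == "*"
        value = variables.get(name)
        if value is None:
            return ""  # undefined variables expand to nothing
        if isinstance(value, (list, tuple)):
            parts = [quote(str(v), safe="") for v in value]
            # Exploded lists get one path segment per item; otherwise the
            # items share a single segment, separated by commas.
            return "/" + ("/" if explode else ",").join(parts)
        return "/" + quote(str(value), safe="")

    return re.sub(r"\{/(\w+)(\*?)\}", replace, template)


print(expand_path_segments(
    "http://example.com/v1/my_data/articles{/id}/action/make_public",
    {"id": "doi:10.5072/FK2/HSYEMH"},
))
print(expand_path_segments("{/list*}", {"list": ["red", "green", "blue"]}))
```

So {/list*} yields a variable number of slashes only when the value is a list; a single string value always lands in one encoded segment.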
02:54	pdurbin	hmm
02:54	impl	if you scroll it up it has the example variables defined
02:55	pdurbin	these guzzle examples are nice
02:57	pdurbin	of course the resulting url isn't percent encoded. one and two are separated by a normal slash: http://example.com/foo/bar/one/two?query=test&more=value
02:58	impl	yeah, it depends on which of the options you use
02:59	impl	stupid IRC client D:
02:59	pdurbin	meanwhile at https://github.com/CrossRef/rest-api-doc/issues/14#issuecomment-61156601 someone has posted the most awkward doi he knows of: http://api.crossref.org/works/10.1002%2F1521-4028%28200203%2942%3A1%3C55%3A%3AAID-JOBM55%3E3.0.CO%3B2-%23 . that's "DOI":"10.1002\/1521-4028(200203)42:1<55::aid-jobm55>3.0.co;2-#"
02:59	impl	try putting an '&' in one of the 'data' values -- it'll have to escape it
03:00	impl	that's ... unsettling
03:01	pdurbin	heh
03:02	pdurbin	so I've been thinking that these crazy identifiers with slashes (DOIs) shouldn't be in the path at all. they should be passes as query parameters instead. but now I'm wondering if it's ok to use query parameters with POST and PUT
03:02	pdurbin	passed*
03:03	impl	that is ok, yeah, but of course you still run into the same problem if a DOI contains an '&', right?
03:03	impl	i don't think there are any character restrictions on the suffix part
03:03	impl	and you have to escape '#' regardless
03:04	impl	tl;dr is DOI suffixes are not URL-safe
03:12	pdurbin	yeah a "&" in a query parameter would be bad news. would need to escape it
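A small Python sketch of the query-parameter variant, using only the standard library. The parameter name persistentId and the helper with_identifier are assumptions for illustration; the awkward DOI is the one quoted from the CrossRef issue above:

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE = "http://example.com/v1/my_data/articles/action/make_public"


def with_identifier(base_url, identifier):
    # urlencode() escapes "&", "#", "/" and friends in the value, so the
    # identifier cannot be mistaken for extra parameters or a fragment.
    return base_url + "?" + urlencode({"persistentId": identifier})


awkward = "10.1002/1521-4028(200203)42:1<55::AID-JOBM55>3.0.CO;2-#"
url = with_identifier(BASE, awkward)
print(url)

# Parsing the query string recovers the identifier unchanged.
round_tripped = parse_qs(urlsplit(url).query)["persistentId"][0]
print(round_tripped == awkward)
```

The escaping still happens, but a client library does it, so users can pass the identifier as is.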
03:12	* pdurbin	shakes his fist
03:17		baweaver joined #rest
03:21	pdurbin	jeez, this doi is pretty ugly too: http://stackoverflow.com/questions/22775353/retrieving-from-a-url-with-many-reserved-characters-via-curl-or-httparty-from-ru
03:22	impl	i think you're best off just telling people to get over themselves and URL-encode it :-)
03:22	pdurbin	let me sleep on it but you may be right :)
03:23	pdurbin	here's the issue on our side: https://github.com/IQSS/dataverse/issues/1837
03:42		the_last joined #rest
03:43	the_last	what's better practice for deleting multiple items? DELETE /api/item?ids=1,2,3,4 or POST /api/item/delete and then [1,2,3,4] as the body
04:35		fuzzyhorns joined #rest
05:14	sfisque	pdurbin i had such a problem. i had to create a DELETE endpoint that handled an identifier with slashes in it. basically we decided to use a header parameter. i originally designed it to use a payload, but some client libs vomited on sending a payload for a DELETE, so we switched to a header
05:15	sfisque	my feeling was, sending a query parameter for a DEL was not clean, and having to do escape magic on a URI just invited all sorts of shenanigans
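For illustration, the header approach could look roughly like this in Python. The header name X-Object-Id, the endpoint, and the helper build_delete are all invented here; the request is only constructed, not sent:

```python
from urllib.request import Request


def build_delete(endpoint, object_id):
    # The identifier travels in a header, so the URL needs no escaping
    # and the DELETE carries no body.
    return Request(endpoint, method="DELETE", headers={"X-Object-Id": object_id})


req = build_delete("http://example.com/v1/objects", "doi:10.5072/FK2/HSYEMH")
print(req.get_method(), req.full_url)
print(req.header_items())
```

Header values have their own character limits, so an identifier containing non-ASCII characters would still need some encoding.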
05:38		fuzzyhorns joined #rest
06:39		fuzzyhorns joined #rest
07:12		_ollie joined #rest
07:26	spaceone	URIs are just names. it doesn't matter if you use a clean human readable path or a query string. query strings remove the ability to cache the resource in many cache implementations and they leak knowledge about the concrete implementation (which is also a security risk). so paths should be preferred in general
07:27	spaceone	the reason you can't easily put something with / into a path segment is that there are too many broken libraries out there
07:27	spaceone	this might work if you access the origin server directly but could fail if you go through an intermediary
07:29	spaceone	RFC 3986 doesn't explicitly mention preserving %2f in the uri when doing normalization
07:31	spaceone	you will have the same problem even more with the query 'string' - you MUST always encode it properly - as well as the path - this is a task which every lib or protocol parser has to do
07:32	spaceone	and btw. you can put everything you want into the query string, it's nowhere standardized that it must be application/x-www-form-urlencoded
07:33	spaceone	as far as i could trace this it is because HTML invented such a thing
07:34	spaceone	so the problem is not the URI, the problem is the bullshit libs out there which don't properly encode things or fail in the corner cases
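The corner case in question can be sketched in Python: whether an encoded slash survives depends on whether the path is split into segments before or after percent-decoding. Both function names and the URL below are made up for illustration:

```python
from urllib.parse import unquote, urlsplit


def segments_split_then_decode(url):
    # Split on literal "/" first, then decode each segment: %2F stays data.
    path = urlsplit(url).path
    return [unquote(s) for s in path.split("/") if s]


def segments_decode_then_split(url):
    # Decode the whole path first: %2F turns into a separator and the
    # identifier is torn into several segments.
    path = unquote(urlsplit(url).path)
    return [s for s in path.split("/") if s]


url = "http://example.com/articles/doi%3A10.5072%2FFK2%2FHSYEMH/action"
print(segments_split_then_decode(url))
print(segments_decode_then_split(url))
```

A library or intermediary that behaves like the second function is the kind that makes identifiers with slashes unreliable in paths.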
07:34	spaceone	(btw. I wrote a library which doesn't fail :))
07:40		fuzzyhorns joined #rest
07:51	spaceone	(in python)
07:56	spaceone	regarding DELETE → I would send DELETE /item/1 then DELETE /item/2 ... or if it is important to bundle them maybe: PUT /transactions/items/removal
07:57	spaceone	(e.g. if you remove 32k entries the latter one)
08:13		timg___ joined #rest
08:41		fuzzyhorns joined #rest
09:05		baweaver joined #rest
09:09		AbuDhar joined #rest
09:13		chthon joined #rest
09:28		graste joined #rest
09:41		fuzzyhorns joined #rest
10:42		fuzzyhorns joined #rest
11:07		baweaver joined #rest
11:14	pdurbin	sfisque: interesting that you used a header parameter
11:15	pdurbin	spaceone: right, I don't have to worry about escaping slashes in a query parameter. just "&"
11:19	spaceone	pdurbin: URI normalization says that /a%2fb/ is equivalent to /a/b/
11:19	spaceone	so basically it can't be used for such cases^^
11:20	spaceone	it is "hierarchical" ^^
11:41		Macaveli joined #rest
11:42	pdurbin	hmm
11:43		fuzzyhorns joined #rest
11:47	pdurbin	so they're equivalent. makes sense, from what I've seen
11:50	spaceone	I would like to see this enhanced in the URI standard so that this becomes possible
12:02		Macaveli_ joined #rest
12:05		Macaveli joined #rest
12:23		Macaveli joined #rest
12:35		Macaveli joined #rest
12:42		Macaveli_ joined #rest
12:44		fuzzyhorns joined #rest
13:08		baweaver joined #rest
13:21		mezod joined #rest
14:12		fuzzyhorns joined #rest
14:14		fuzzyhor_ joined #rest
15:30	sfisque	spaceone the issue is the slashes were opaque. they dont indicate any hierarchy. it was an artifact of interfacing with a system that created pathological id keys for objects.
15:30	sfisque	so splitting the key and doing sub-deletes would have created irrational code
15:36		fuzzyhorns joined #rest
15:47	pdurbin	yeah
15:47	pdurbin	down with pathological id keys
15:48	pdurbin	"DOI names may incorporate any printable characters from the Universal Character Set" https://www.doi.org/doi_handbook/2_Numbering.html
15:48	pdurbin	I prefer integers as keys :)
15:53	trygvis	integers are nice if you're using sql
15:54	pdurbin	or uuids. just not DOIs :)
15:55	trygvis	uuid are bad for performance, they crush your indexes
15:55	pdurbin	oh
15:55	trygvis	uuid are nice as the client can create them
15:56	trygvis	this site is awesome: http://use-the-index-luke.com/
15:58	pdurbin	nice
16:00	trygvis	we always use long (64 bit) for our entities, but some have a uuid field too
16:04		wavded joined #rest
16:15		jgornick joined #rest
16:37		fuzzyhorns joined #rest
16:49		baweaver joined #rest
17:16		timg___ joined #rest
17:38		fuzzyhorns joined #rest
17:42		wavded joined #rest
17:56		_ollie joined #rest
18:39		fuzzyhorns joined #rest
18:41		Macaveli joined #rest
19:39		fuzzyhorns joined #rest
19:43		anth0ny joined #rest
20:40		fuzzyhorns joined #rest
21:29		Macaveli joined #rest
21:41		fuzzyhorns joined #rest
22:26		fuzzyhorns joined #rest
23:21		fuzzyhorns joined #rest
23:22		vanHoesel joined #rest
