Time |
S |
Nick |
Message |
00:10 |
|
|
fuzzyhorns joined #rest |
00:20 |
|
pdurbin |
got into a big argument about URLs (path vs. query parameters) and REST today |
00:37 |
|
pdurbin |
well, a difference of opinion at least |
00:37 |
|
pdurbin |
:) |
00:40 |
|
pdurbin |
I think what he was really making an argument for was pretty URLs: https://en.wikipedia.org/wiki/Semantic_URL |
00:42 |
|
pdurbin |
"Semantic URLs, by contrast, contain only the path of a resource" |
01:03 |
|
|
fuzzyhorns joined #rest |
01:10 |
|
* pdurbin |
looks at http://programmers.stackexchange.com/questions/270898/designing-a-rest-api-by-uri-vs-query-string |
01:25 |
|
pdurbin |
so in this example... |
01:25 |
|
pdurbin |
response = client.post('http://api.figshare.com/v1/my_data/articles/92285/action/make_public', auth=oauth) |
01:25 |
|
pdurbin |
... what do you call 92285? |
01:27 |
|
|
locks joined #rest |
02:11 |
|
pdurbin |
let's call it an identifier |
02:13 |
|
impl |
are we having a semantistential crisis? |
02:30 |
|
pdurbin |
yes |
02:33 |
|
pdurbin |
the identifier is part of the path which makes total sense when the identifier is an integer such as 92285 (probably a database id) but what if users know their dataset not by some internal database id but by an identifier that may have various numbers of slashes in it such as doi:10.5072/FK2/HSYEMH ... this doesn't work so well in a path |
02:35 |
|
pdurbin |
this doesn't work so well, I mean, because of the slashes (may be 2 slashes or 1 slash) in the identifier: client.post('http://example.com/v1/my_data/articles/doi:10.5072/FK2/HSYEMH/action/make_public') |
02:35 |
|
pdurbin |
right? |
02:37 |
|
impl |
i mean... i guess you could use %2f if you really cared? i'd either use a surrogate or just replace slashes with another character |
02:38 |
|
impl |
http://example.com/v1/my_data/articles/?doi=doi:10.1234/a/b/c -> 301 to /v1/my_data/articles/92285 |
02:38 |
|
pdurbin |
well, the users might be grumpy about not being able to use the identifiers as is. they probably don't want to percent encode them or replace the slashes with some other character |
02:40 |
|
pdurbin |
impl: but would http://example.com/v1/my_data/articles/?doi=doi:10.1234/action/make_public make sense? probably not... there are other actions like "publish" or whatever |
02:40 |
|
impl |
no, they'd have to discover the proper root URL first |
02:41 |
|
impl |
i mean if you really want to make this hierarchical you're stuck with using the way that URIs define hierarchy |
02:41 |
|
pdurbin |
right. slashes are special |
02:42 |
|
pdurbin |
the point is that it's annoying when identifiers have slashes in them. you have to do tricks to put them in the path |
02:42 |
|
impl |
yeah |
02:43 |
|
impl |
(as an aside - DOI identifiers are indeed hierarchical, i just checked) |
02:44 |
|
impl |
the syntax is directory.registrant.subcode/uniqueid |
02:44 |
|
pdurbin |
yeah, and the reason why sometimes there's an extra slash is that the "shoulder" is optional |
02:45 |
|
impl |
pdurbin: in any case, i think the general line of thought on that sort of thing is that your client would substitute the ID, performing encoding as necessary |
02:46 |
|
impl |
like, most people probably aren't going to write it like you have it... they're going to do something like client.post('http://example.com/v1/my_data/articles/:id/action/make_public', {id: 'doi/10.5072/FK2/HSYEMH'}) |
02:46 |
|
impl |
or even better if you provide a client, you can say something like client.for_article('doi/10.5072/FK2/HSYEMH').make_public() |
02:47 |
|
pdurbin |
impl: is that a literal ":id" string in that path? |
02:47 |
|
impl |
pdurbin: no, it'd be substituted by the client |
02:48 |
|
pdurbin |
but then I'd have to deal with variable numbers of slashes in the id |
02:50 |
|
impl |
oh, there's a whole RFC for this |
02:50 |
|
impl |
http://tools.ietf.org/html/rfc6570 |
02:51 |
|
pdurbin |
which part? |
02:51 |
|
impl |
http://tools.ietf.org/html/rfc6570#section-3.2.6 |
02:52 |
|
impl |
PHP's guzzle client supports this for instance |
02:52 |
|
impl |
http://guzzle3.readthedocs.org/http-client/uri-templates.html |
02:53 |
|
pdurbin |
so "{/list*} /red/green/blue" means variable number of slashes? |
02:53 |
|
impl |
in your case: $client->get(['http://example.com/v1/my_data/articles{/id}/action/make_public', ['id' => 'doi:10.5072/FK2/HSYEMH']]); would result in a request to http://example.com/v1/my_data/articles/doi:10.5072%2FFK2%2FHSYEMH/action/make_public |
02:54 |
|
impl |
yeah, it looks like that |
02:54 |
|
pdurbin |
hmm |
02:54 |
|
impl |
if you scroll it up it has the example variables defined |
02:55 |
|
pdurbin |
these guzzle examples are nice |
02:57 |
|
pdurbin |
of course the resulting url isn't percent encoded. one and two are separated by a normal slash: http://example.com/foo/bar/one/two?query=test&more=value |
02:58 |
|
impl |
yeah, it depends on which of the options you use |
02:59 |
|
impl |
stupid IRC client D: |
02:59 |
|
pdurbin |
meanwhile at https://github.com/CrossRef/rest-api-doc/issues/14#issuecomment-61156601 someone has posted the most awkward doi he knows of: http://api.crossref.org/works/10.1002%2F1521-4028%28200203%2942%3A1%3C55%3A%3AAID-JOBM55%3E3.0.CO%3B2-%23 . that's "DOI":"10.1002\/1521-4028(200203)42:1<55::aid-jobm55>3.0.co;2-#" |
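A minimal sketch of the "just percent-encode it" option discussed above, using only Python's standard library. The helper name `article_action_url` is hypothetical; the host and path layout are the example.com ones from the conversation. Note that `quote(..., safe="")` also escapes the colon, so the first result differs slightly from the URL impl typed by hand.

```python
from urllib.parse import quote, unquote

# Example base URL taken from the conversation; not a real service.
BASE = "http://example.com/v1/my_data/articles"

def article_action_url(identifier: str, action: str) -> str:
    """Hypothetical helper: put the identifier into ONE path segment."""
    # safe="" makes quote() escape "/" as %2F; by default "/" is left alone.
    segment = quote(identifier, safe="")
    return f"{BASE}/{segment}/action/{action}"

url = article_action_url("doi:10.5072/FK2/HSYEMH", "make_public")
print(url)

# The "most awkward" DOI quoted in the CrossRef issue.
awkward = "10.1002/1521-4028(200203)42:1<55::AID-JOBM55>3.0.CO;2-#"
encoded = quote(awkward, safe="")
print(encoded)

# Decoding gives back the original identifier unchanged.
print(unquote(encoded) == awkward)
```

Whether the server or an intermediary preserves `%2F` inside a path segment is a separate question; this only covers the client side.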
02:59 |
|
impl |
try putting an '&' in one of the 'data' values -- it'll have to escape it |
03:00 |
|
impl |
that's ... unsettling |
03:01 |
|
pdurbin |
heh |
03:02 |
|
pdurbin |
so I've been thinking that these crazy identifiers with slashes (DOIs) shouldn't be in the path at all. they should be passes as query parameters instead. but now I'm wondering if it's ok to use query parameters with POST and PUT |
03:02 |
|
pdurbin |
passed* |
03:03 |
|
impl |
that is ok, yeah, but of course you still run into the same problem if a DOI contains an '&', right? |
03:03 |
|
impl |
i don't think there are any character restrictions on the suffix part |
03:03 |
|
impl |
and you have to escape '#' regardless |
03:04 |
|
impl |
tl;dr is DOI suffixes are not URL-safe |
03:12 |
|
pdurbin |
yeah a "&" in a query parameter would be bad news. would need to escape it |
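A rough sketch of the query-parameter variant, again with Python's standard library only. The parameter name `doi` follows impl's earlier example URL; the helper `lookup_url` and the DOI containing "&" and "#" are made up for illustration. It shows why naive string concatenation loses part of the identifier while `urlencode` round-trips it.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE = "http://example.com/v1/my_data/articles/"  # example host from the chat

def lookup_url(doi: str) -> str:
    """Hypothetical helper: pass the DOI as a properly escaped query parameter."""
    return f"{BASE}?{urlencode({'doi': doi})}"

doi = "doi:10.1234/a&b#c"  # invented identifier with awkward characters

# Naive concatenation: "&" starts a new parameter and "#" starts the fragment.
naive = f"{BASE}?doi={doi}"
naive_value = parse_qs(urlsplit(naive).query)["doi"]
print(naive_value)

# Escaped version: the server-side parser sees the whole identifier again.
url = lookup_url(doi)
escaped_value = parse_qs(urlsplit(url).query)["doi"]
print(url)
print(escaped_value)
```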
03:12 |
|
* pdurbin |
shakes his fist |
03:17 |
|
|
baweaver joined #rest |
03:21 |
|
pdurbin |
jeez, this doi is pretty ugly too: http://stackoverflow.com/questions/22775353/retrieving-from-a-url-with-many-reserved-characters-via-curl-or-httparty-from-ru |
03:22 |
|
impl |
i think you're best off just telling people to get over themselves and URL-encode it :-) |
03:22 |
|
pdurbin |
let me sleep on it but you may be right :) |
03:23 |
|
pdurbin |
here's the issue on our side: https://github.com/IQSS/dataverse/issues/1837 |
03:42 |
|
|
the_last joined #rest |
03:43 |
|
the_last |
what's better practice for deleting multiple items? DELETE /api/item?ids=1,2,3,4 or POST /api/item/delete and then [1,2,3,4] as the body |
04:35 |
|
|
fuzzyhorns joined #rest |
05:14 |
|
sfisque |
pdurbin i had such a problem. i had to create a DELETE endpoint that handled an identifier with slashes in it. basically we decided to use a header parameter. i originally designed it to use a payload, but some client libs vomited on sending a payload for a DELETE, so we switched to a header |
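One way the header approach sfisque describes might look on the client side, as a sketch only. The header name `X-Item-Id` is an assumption (the chat does not say which header was used), the endpoint is the `/api/item` path from the_last's question on an example host, and the request is built but never sent.

```python
from urllib.request import Request

ENDPOINT = "http://example.com/api/item"  # example URL, not a real service

def delete_request(identifier: str) -> Request:
    """Hypothetical helper: carry the identifier in a header instead of the URI."""
    # The identifier goes into the header as is, slashes included, so the
    # URI itself needs no escaping. "X-Item-Id" is an invented header name.
    return Request(ENDPOINT, method="DELETE", headers={"X-Item-Id": identifier})

req = delete_request("doi:10.5072/FK2/HSYEMH")
print(req.get_method(), req.full_url)
print(req.header_items())
```

Header values have their own character restrictions, so identifiers outside ASCII would still need some encoding with this design.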
05:15 |
|
sfisque |
my feeling was, sending a query parameter for a DEL was not clean, and having to do escape magic on a URI just invited all sorts of shenanigans |
05:38 |
|
|
fuzzyhorns joined #rest |
06:39 |
|
|
fuzzyhorns joined #rest |
07:12 |
|
|
_ollie joined #rest |
07:26 |
|
spaceone |
URIs are just names. it doesn't matter if you use a clean human readable path or a query string. query strings remove the ability to cache the resource in many cache implementations and they reveal knowledge about the concrete implementation (which is also a security risk). so paths should be preferred in general |
07:27 |
|
spaceone |
the problem that you can't easily put something with / into a path segment is that there are too many broken libraries out there |
07:27 |
|
spaceone |
this might work if you access the origin server directly but could fail if you go through an intermediary |
07:29 |
|
spaceone |
RFC 3986 doesn't explicitly mention preserving %2f in the URI when doing normalization |
07:31 |
|
spaceone |
you will have the same problem even more with the query 'string' - you MUST always encode it properly - as well as the path - this is a task which every lib or protocol parser has to do |
07:32 |
|
spaceone |
and btw. you can put everything you want into the query string, it's nowhere standardized that it must be application/x-www-form-urlencoded |
07:33 |
|
spaceone |
as far as i could trace this it is because HTML invented such thing |
07:34 |
|
spaceone |
so the problem is not the URI; the problem is the bullshit libs out there which don't properly encode things or fail in the corner cases |
07:34 |
|
spaceone |
(btw. I wrote a library which doesn't fail :)) |
07:40 |
|
|
fuzzyhorns joined #rest |
07:51 |
|
spaceone |
(in python) |
07:56 |
|
spaceone |
regarding DELETE → I would send DELETE /item/1 then DELETE /item/2 ... or if it is important to bundle them maybe: PUT /transactions/items/removal |
07:57 |
|
spaceone |
(e.g. if you remove 32k entries the latter one) |
08:13 |
|
|
timg___ joined #rest |
08:41 |
|
|
fuzzyhorns joined #rest |
09:05 |
|
|
baweaver joined #rest |
09:09 |
|
|
AbuDhar joined #rest |
09:13 |
|
|
chthon joined #rest |
09:28 |
|
|
graste joined #rest |
09:41 |
|
|
fuzzyhorns joined #rest |
10:42 |
|
|
fuzzyhorns joined #rest |
11:07 |
|
|
baweaver joined #rest |
11:14 |
|
pdurbin |
sfisque: interesting that you used a header parameter |
11:15 |
|
pdurbin |
spaceone: right, I don't have to worry about escaping slashes in a query parameter. just "&" |
11:19 |
|
spaceone |
pdurbin: URI normalization says that /a%2fb/ is equivalent to /a/b/ |
11:19 |
|
spaceone |
so basically it can't be used for such cases^^ |
11:20 |
|
spaceone |
it is "hierarchical" ^^ |
11:41 |
|
|
Macaveli joined #rest |
11:42 |
|
pdurbin |
hmm |
11:43 |
|
|
fuzzyhorns joined #rest |
11:47 |
|
pdurbin |
so they're equivalent. makes sense, from what I've seen |
11:50 |
|
spaceone |
I would like to see this enhanced in the URI standard so that this becomes possible |
12:02 |
|
|
Macaveli_ joined #rest |
12:05 |
|
|
Macaveli joined #rest |
12:23 |
|
|
Macaveli joined #rest |
12:35 |
|
|
Macaveli joined #rest |
12:42 |
|
|
Macaveli_ joined #rest |
12:44 |
|
|
fuzzyhorns joined #rest |
13:08 |
|
|
baweaver joined #rest |
13:21 |
|
|
mezod joined #rest |
14:12 |
|
|
fuzzyhorns joined #rest |
14:14 |
|
|
fuzzyhor_ joined #rest |
15:30 |
|
sfisque |
spaceone the issue is the slashes were opaque. they dont indicate any hierarchy. it was an artifact of interfacing with a system that created pathological id keys for objects. |
15:30 |
|
sfisque |
so splitting the key and doing sub-deletes would have created irrational code |
15:36 |
|
|
fuzzyhorns joined #rest |
15:47 |
|
pdurbin |
yeah |
15:47 |
|
pdurbin |
down with pathological id keys |
15:48 |
|
pdurbin |
"DOI names may incorporate any printable characters from the Universal Character Set" https://www.doi.org/doi_handbook/2_Numbering.html |
15:48 |
|
pdurbin |
I prefer integers as keys :) |
15:53 |
|
trygvis |
integers are nice if you're using sql |
15:54 |
|
pdurbin |
or uuids. just not DOIs :) |
15:55 |
|
trygvis |
uuid are bad for performance, they crush your indexes |
15:55 |
|
pdurbin |
oh |
15:55 |
|
trygvis |
uuid are nice as the client can create them |
15:56 |
|
trygvis |
this site is awesome: http://use-the-index-luke.com/ |
15:58 |
|
pdurbin |
nice |
16:00 |
|
trygvis |
we always use long (64 bit) for our entities, but some have a uuid field too |
16:04 |
|
|
wavded joined #rest |
16:15 |
|
|
jgornick joined #rest |
16:37 |
|
|
fuzzyhorns joined #rest |
16:49 |
|
|
baweaver joined #rest |
17:16 |
|
|
timg___ joined #rest |
17:38 |
|
|
fuzzyhorns joined #rest |
17:42 |
|
|
wavded joined #rest |
17:56 |
|
|
_ollie joined #rest |
18:39 |
|
|
fuzzyhorns joined #rest |
18:41 |
|
|
Macaveli joined #rest |
19:39 |
|
|
fuzzyhorns joined #rest |
19:43 |
|
|
anth0ny joined #rest |
20:40 |
|
|
fuzzyhorns joined #rest |
21:29 |
|
|
Macaveli joined #rest |
21:41 |
|
|
fuzzyhorns joined #rest |
22:26 |
|
|
fuzzyhorns joined #rest |
23:21 |
|
|
fuzzyhorns joined #rest |
23:22 |
|
|
vanHoesel joined #rest |