stream.hls: parse M3U8 from Response obj directly #4552
Merged
`str` and `requests.Response` in `M3U8Parser.parse`
https://datatracker.ietf.org/doc/html/rfc8216#section-9
Passing the HTTP Response object to the parser and iterating its content avoids having to keep a copy of the entire response content in memory. Support for `str` is kept for backwards compatibility and because some tests rely on it and would otherwise need to be rewritten.

The content is now also always read as UTF-8, as defined by RFC 8216. Previously, the encoding was guessed by `chardet`/`charset_normalizer` if no HTTP response headers were set (see #4329), and custom overrides were required when guessing the unknown encoding went wrong. All HLS tests have been using implicit UTF-8 encoding the entire time. We could add tests for invalid encodings in the future, but I don't think it's important.

I have a couple more changes planned for the `HLSStream` and `M3U8` implementations. They're not ready yet and are unrelated to this PR, but I just want to quickly talk about them.

One of the changes is using `typing.Generic` + `typing.TypeVar` to make it easier to subclass HLS streams, the parser and other stuff, without having to suppress invalid type information or method signatures (see the Twitch plugin for example). Using dataclasses instead of named tuples is also one of the things which will make subclassing/extending HLS logic easier.

Another change is passing the parsed master playlist object to the HLSStreams created by `parse_variant_playlist`, so additional data can be read by the media playlists and their parsers. It currently only adds the master playlist URL to the `{Muxed,}HLSStream` for the `to_manifest_url` method (I also want to change this interface eventually, but that may be a breaking change). I noticed the master playlist issue while rewriting Twitch's low latency stuff two days ago, because there's metadata in the master playlist that should be read by the media playlist. It's also relevant for `EXT-X-SESSION-DATA` and `EXT-X-SESSION-KEY`, which are currently not implemented.
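
For illustration, the memory and encoding points from the start of this description can be sketched roughly as follows. This is not the code from this PR: `iter_playlist_lines`, `collect_tags` and `FakeResponse` are hypothetical names, and `FakeResponse` only mimics the `iter_lines()` method of `requests.Response`, so that the example runs without a network request.

```python
from typing import Iterable, Iterator, List, Union


class FakeResponse:
    """Hypothetical stand-in exposing only iter_lines(), like requests.Response."""

    def __init__(self, body: bytes) -> None:
        self._body = body

    def iter_lines(self) -> Iterable[bytes]:
        # A real response would yield lines as chunks arrive from the socket
        yield from self._body.splitlines()


def iter_playlist_lines(source: Union[str, FakeResponse]) -> Iterator[str]:
    """Yield playlist lines from either a str or a response-like object."""
    if isinstance(source, str):
        # Backwards-compatible path: the whole playlist is already in memory
        yield from source.splitlines()
        return
    for raw in source.iter_lines():
        # Always decode as UTF-8 instead of guessing the encoding
        yield raw.decode("utf-8")


def collect_tags(source: Union[str, FakeResponse]) -> List[str]:
    """Collect all lines that start a playlist tag (assumption: '#EXT' prefix)."""
    return [line for line in iter_playlist_lines(source) if line.startswith("#EXT")]


playlist = "#EXTM3U\n#EXT-X-VERSION:3\n#EXTINF:4.0,\nsegment0.ts\n"
tags_from_str = collect_tags(playlist)
tags_from_response = collect_tags(FakeResponse(playlist.encode("utf-8")))
```

Both inputs go through the same line iterator, so the `str` path used by existing tests and the response path share one code path after decoding.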