plugins.htv: new plugin #4431

Merged on Apr 6, 2022 (1 commit)

mkbloke (Member) commented Apr 4, 2022

Nobody commented on #4225 (comment), so...

URLs:

$ streamlink htv.com.vn/truc-tuyen # defaults to first channel in list
$ streamlink htv.com.vn/truc-tuyen?channel=n # where n is valid channel ID

Channel IDs are discovered from the player page and can be shown in debug output:

$ streamlink --http-proxy 'socks4://localhost:9050' -l debug 'htv.com.vn/truc-tuyen'
[cli][debug] OS:         Linux-5.4.0-107-generic-x86_64-with-glibc2.29
[cli][debug] Python:     3.8.10
[cli][debug] Streamlink: 3.2.0+13.g59e847e
[cli][debug] Requests(2.27.1), Socks(1.7.1), Websocket(1.3.2)
[cli][debug] Arguments:
[cli][debug]  url=htv.com.vn/truc-tuyen
[cli][debug]  --loglevel=debug
[cli][debug]  --player=mpv
[cli][debug]  --http-proxy=socks4://localhost:9050
[cli][info] Found matching plugin htv for URL htv.com.vn/truc-tuyen
[plugins.htv][debug] channels={'1': 'htv7', '3': 'htv9', '10': 'htvtt', '14': 'htv4'}
[utils.l10n][debug] Language code: en_GB
Available streams: 1080p (worst, best)
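
The channel discovery described above can be sketched in isolation. This is a minimal illustration using only the standard library, not the plugin's actual code (which uses Streamlink's validation schemas). The `data-id` and `data-code` attribute names and the `channel-list` class come from the patch in this PR; the sample HTML and the `ChannelListParser` and `discover_channels` names are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical markup: the real player page is assumed to look roughly like this.
SAMPLE_HTML = """
<div class="channel-list">
  <a data-id="1" data-code="htv7">HTV7</a>
  <a data-id="3" data-code="htv9">HTV9</a>
</div>
<a data-code="other">not a channel</a>
"""


class ChannelListParser(HTMLParser):
    """Collect data-id -> data-code pairs from anchors inside a channel list.

    Assumes well-nested markup without unclosed void elements such as <img>.
    """

    def __init__(self):
        super().__init__()
        self.depth = 0  # nesting depth inside the channel-list element
        self.channels = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.depth:
            self.depth += 1
        elif "channel-list" in (attrs.get("class") or "").split():
            self.depth = 1
        if self.depth and tag == "a" and "data-id" in attrs and "data-code" in attrs:
            self.channels[attrs["data-id"]] = attrs["data-code"]

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1


def discover_channels(html):
    """Return a dict mapping channel IDs to channel codes."""
    parser = ChannelListParser()
    parser.feed(html)
    return parser.channels


print(discover_channels(SAMPLE_HTML))
```

The anchor outside the channel list is ignored because it is not a descendant of the `channel-list` element and has no `data-id` attribute.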

The channel ID URLs that were valid at the time of this PR are listed in the comment below.

closes #4225

mkbloke (Member Author) commented Apr 5, 2022

After all that, I notice that the channels can be accessed via canonical URLs, but not all of them seem to be linked to. On the front page:

https://www.htv.com.vn/truc-tuyen?channel=1 # linked to via HTV7 channel icon
https://www.htv.com.vn/truc-tuyen?channel=3 # linked to via HTV9 channel icon
https://www.htv.com.vn/truc-tuyen?channel=10 # works, not linked to, HTVTT channel icon links to another page
https://www.htv.com.vn/truc-tuyen?channel=14 # works, not linked to, HTV4 channel icon links to schedule page for channel

I should look for these URLs in the plugin and do the right thing if they are supplied, but it looks like the command line option for channel selection will also have to stay. Urgh.
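
As a rough illustration of reading the channel ID from such a canonical URL, here is a small standard-library sketch. It is not the plugin's approach (the plugin matches the URL with a regular expression); `channel_from_url` is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlsplit


def channel_from_url(url):
    """Return the value of the channel= query parameter, or None if absent."""
    query = parse_qs(urlsplit(url).query)
    values = query.get("channel")
    return values[0] if values else None


print(channel_from_url("https://www.htv.com.vn/truc-tuyen?channel=10"))
print(channel_from_url("https://www.htv.com.vn/truc-tuyen"))
```

Note that the returned ID is a string, which fits the string keys of the channels mapping shown in the debug output above.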

mkbloke (Member Author) commented Apr 5, 2022

Thinking about the above, what should take priority: the supplied canonical URL when it includes channel= or the command line option?

bastimeyer (Member) left a comment

> Nobody commented on #4225 (comment)

Plugin channel selection CLI arguments should not be added.

If there's no canonical URL available for specific channels, then add selection logic via the URL's hash data, which is always optional. As you mentioned, though, there are links for the channel selection; it's just that the site's player implementation doesn't update the URL's query string when switching channels.

mkbloke force-pushed the htv branch 2 times, most recently from 6002515 to 9acae30, on April 5, 2022 22:53

bastimeyer (Member) left a comment

  1. Some improvements regarding the channel selection: the name id shadows builtins.id, empty lists/dicts are already falsy, and the selection logic can be simplified.
  2. validate.xml_xpath allows full XPath queries (such as contains()), unlike the limited path syntax of validate.xml_findall. This is required for selecting only the anchor elements that are descendants of the .channel-list element.
  3. The plugin matcher test can be refined / improved.
diff --git a/src/streamlink/plugins/htv.py b/src/streamlink/plugins/htv.py
index ece1d3f0..a23205c3 100644
--- a/src/streamlink/plugins/htv.py
+++ b/src/streamlink/plugins/htv.py
@@ -17,7 +17,7 @@ log = logging.getLogger(__name__)
 
 
 @pluginmatcher(re.compile(
-    r"https?://(?:www\.)?htv\.com\.vn/truc-tuyen(?:\?channel=(\d+))?$"
+    r"https?://(?:www\.)?htv\.com\.vn/truc-tuyen(?:\?channel=(?P<channel>\w+)&?|$)"
 ))
 class HTV(Plugin):
     hls_url_re = re.compile(r'var\s+iosUrl\s*=\s*"([^"]+)"')
@@ -25,15 +25,13 @@ class HTV(Plugin):
     def get_channels(self):
         data = self.session.http.get(self.url, schema=validate.Schema(
             validate.parse_html(),
-            validate.xml_findall(".//a[@data-code]"), [
+            validate.xml_xpath(".//*[contains(@class,'channel-list')]//a[@data-id][@data-code]"),
+            [
                 validate.union_get("data-id", "data-code"),
             ],
         ))
 
-        if not data:
-            return
-
-        return {i[0]: i[1] for i in data}
+        return {k: v for k, v in data}
 
     def _get_streams(self):
         channels = self.get_channels()
@@ -44,17 +42,16 @@ class HTV(Plugin):
 
         log.debug(f"channels={channels}")
 
-        channel_id = list(channels.keys())[0]
-        channel_code = list(channels.values())[0]
-
-        id = self.match.group(1)
-        if id and id not in channels:
-            log.error(f"Unknown channel ID: {id}")
+        channel_id = self.match.group("channel")
+        if channel_id is None:
+            channel_id, channel_code = next(iter(channels.items()))
+        elif channel_id in channels:
+            channel_code = channels[channel_id]
+        else:
+            log.error(f"Unknown channel ID: {channel_id}")
             return
 
-        if id:
-            channel_id = id
-            channel_code = channels[id]
+        log.info(f"Channel: {channel_code}")
 
         json = self.session.http.post(
             "https://www.htv.com.vn/HTVModule/Services/htvService.aspx",
@@ -66,7 +63,8 @@ class HTV(Plugin):
                 "date": date.today().strftime("%d-%m-%Y"),
             },
             schema=validate.Schema(
-                validate.parse_json(), {
+                validate.parse_json(),
+                {
                     "success": bool,
                     "chanelUrl": validate.url(),
                 },
diff --git a/tests/plugins/test_htv.py b/tests/plugins/test_htv.py
index 18c18149..ac1306eb 100644
--- a/tests/plugins/test_htv.py
+++ b/tests/plugins/test_htv.py
@@ -5,28 +5,20 @@ from tests.plugins import PluginCanHandleUrl
 class TestPluginCanHandleUrlHTV(PluginCanHandleUrl):
     __plugin__ = HTV
 
-    should_match = [
-        "http://htv.com.vn/truc-tuyen",
-        "http://www.htv.com.vn/truc-tuyen",
-        "http://www.htv.com.vn/truc-tuyen?channel=1",
-        "http://www.htv.com.vn/truc-tuyen?channel=456",
-        "https://htv.com.vn/truc-tuyen",
-        "https://www.htv.com.vn/truc-tuyen",
-        "https://www.htv.com.vn/truc-tuyen?channel=1",
-        "https://www.htv.com.vn/truc-tuyen?channel=456",
+    should_match_groups = [
+        ("https://htv.com.vn/truc-tuyen", {}),
+        ("https://htv.com.vn/truc-tuyen?channel=123", {"channel": "123"}),
+        ("https://htv.com.vn/truc-tuyen?channel=123&foo", {"channel": "123"}),
+        ("https://www.htv.com.vn/truc-tuyen", {}),
+        ("https://www.htv.com.vn/truc-tuyen?channel=123", {"channel": "123"}),
+        ("https://www.htv.com.vn/truc-tuyen?channel=123&foo", {"channel": "123"}),
     ]
 
     should_not_match = [
-        "http://htv.com.vn/",
-        "http://htv.com.vn/any/path",
-        "http://www.htv.com.vn/",
-        "http://www.htv.com.vn/any/path",
-        "http://www.htv.com.vn/truc-tuyen?channel=x",
-        "http://www.htv.com.vn/truc-tuyen?other=1",
         "https://htv.com.vn/",
         "https://htv.com.vn/any/path",
+        "https://htv.com.vn/truc-tuyen?foo",
         "https://www.htv.com.vn/",
         "https://www.htv.com.vn/any/path",
-        "https://www.htv.com.vn/truc-tuyen?channel=x",
-        "https://www.htv.com.vn/truc-tuyen?other=1",
+        "https://www.htv.com.vn/truc-tuyen?foo",
     ]
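
The simplified selection logic from the patch above can be tried out on its own. The sketch below mirrors the three branches of the patch; the standalone `select_channel` function is a hypothetical wrapper, and the channel data is taken from the debug output earlier in this thread. It assumes a non-empty mapping: with an empty dict, `next(iter(...))` would raise `StopIteration`, so callers are assumed to check for that first.

```python
def select_channel(channels, requested=None):
    """Return (channel_id, channel_code), or None if the requested ID is unknown.

    channels: dict mapping channel IDs (str) to channel codes (str), non-empty.
    requested: channel ID taken from the URL, or None if the URL had none.
    """
    if requested is None:
        # No channel in the URL: default to the first channel in the list.
        return next(iter(channels.items()))
    if requested in channels:
        return requested, channels[requested]
    return None


channels = {"1": "htv7", "3": "htv9", "10": "htvtt", "14": "htv4"}
print(select_channel(channels))
print(select_channel(channels, "10"))
print(select_channel(channels, "99"))
```

Relying on dict insertion order for the default keeps the "first channel in list" behaviour without building intermediate lists of keys and values.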

mkbloke (Member Author) commented Apr 6, 2022

Points 1 and 2 are good to know. Regarding point 3, I'd forgotten about using should_match_groups.

Thanks.
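
To see what the refined matcher captures, the pattern from the patch can be checked directly with the `re` module. This sketch assumes the pattern is applied with `re.match` semantics and that unmatched named groups are dropped before comparison, which is how the `{}` entries in the test data read; `matched_groups` is a hypothetical helper, not part of Streamlink's test suite.

```python
import re

# Pattern copied from the patch in this PR.
HTV_URL = re.compile(
    r"https?://(?:www\.)?htv\.com\.vn/truc-tuyen(?:\?channel=(?P<channel>\w+)&?|$)"
)


def matched_groups(url):
    """Return the named groups that matched, or None if the URL does not match."""
    match = HTV_URL.match(url)
    if match is None:
        return None
    # Drop groups that did not participate in the match.
    return {k: v for k, v in match.groupdict().items() if v is not None}


for url in (
    "https://htv.com.vn/truc-tuyen",
    "https://htv.com.vn/truc-tuyen?channel=123&foo",
    "https://htv.com.vn/truc-tuyen?foo",
):
    print(url, matched_groups(url))
```

The `|$` alternative is what rejects `?foo`: without a `channel=` parameter, the URL has to end right after `truc-tuyen`.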

bastimeyer merged commit 83abb0f into streamlink:master on Apr 6, 2022

bastimeyer (Member) commented:

Thanks!

mkbloke deleted the htv branch on April 6, 2022 17:53

Linked issue: htv.com.vn / htv-livecdn.fptplay.net/htvonline