What I learned from porting the scanner module
Introduction
A client recently asked me for a way to let content authors find and replace words across potentially every piece of content on their site. I have had this request before and solved it with relative ease: in Drupal 7 (D7) there is a module called scanner which does exactly what they needed. Drupal 8 (D8) had been around for three and a half years at this point, so I hoped that someone had already gone through the effort of porting it and that I could simply download and enable the module and show the users how to use the feature. Unfortunately for me, there was no such option for D8. After some searching, the only thing I found with Drupal 8 support was a Drush script in a git repo, which wouldn't work since I needed a UI. So I ended up porting the module myself, and in the paragraphs below I will briefly explain several things I learned along the way.
1. Don't reinvent the wheel if you don't have to
Before I wrote a single line of code I went into the issue queue to see if anyone had requested, or even started on, a port of the module. As it turned out, someone had started the work and attached a patch which they described as a very rudimentary first go at getting it to work. In the comment they explained that the search functionality was working and that it would search through any node's text fields. After cloning the D7 code repo and applying the patch I was able to confirm that it did in fact work as they said. I then looked at how the old code worked so that I could get an idea of how the rest of the functionality should be achieved.
The first piece of functionality that I got working was the set of "search options", which help the user target only the instances that they want to replace. The person who had done the initial port had written the majority of the functionality into a controller, so I (initially at least) continued that convention for building the conditional clauses from the "search options" selected by the user. Since this part was just vanilla PHP, I was able to more or less copy and paste the D7 code, with only minor modifications, to get it working in D8. When I went to write the replacement functionality I took the same approach: I looked at the old code and reused what made sense. I could have tried to come up with my own logic for how these features should work, but instead I chose to borrow (both conceptually and literally) from the existing code. Doing this saved me the time and energy of working out how the code should behave, and allowed me to spend my time thinking about how best to translate it into D8 code. I won't go into great detail about the exact code written to achieve all of the module's functionality. That can be found in the code itself and in the follow-up article, which goes into more detail and includes code snippets.
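To make the idea of "search options" a bit more concrete, here is a minimal, language-neutral sketch in Python. It is not the module's PHP code: the option names (case_sensitive, whole_word, regex) and both function names are assumptions made up for illustration, and it swaps the conditional clauses mentioned above for a single regular expression.

```python
import re


def build_pattern(term, case_sensitive=False, whole_word=False, regex=False):
    """Compile a search pattern from the options the user ticked.

    All option names here are hypothetical, chosen only for this sketch.
    """
    # Treat the term as literal text unless regex matching was requested.
    body = term if regex else re.escape(term)
    if whole_word:
        # Only match when the term is not part of a longer word.
        body = r"\b(?:" + body + r")\b"
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.compile(body, flags)


def replace_in_text(text, term, replacement, **options):
    """Return (new_text, number_of_replacements)."""
    pattern = build_pattern(term, **options)
    # A callable keeps the replacement literal (no backslash expansion).
    return pattern.subn(lambda match: replacement, text)
```

Calling replace_in_text("Cat catalog cat", "cat", "dog", whole_word=True) leaves "catalog" alone, while the same call without whole_word also rewrites the start of "catalog". Keeping each option as its own small branch is what makes it possible to test the options one at a time.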
2. Don't be afraid to look for and try out new ideas
I started working with Drupal shortly before Drupal 7 was released. D6 and D7 were pretty similar when it came to the API, so the transition wasn't all that jarring. While I certainly didn't think I knew everything there was to know about the API, I did feel I had a pretty good grasp of the important and useful hooks and modules available. With the introduction of Drupal 8 a lot of that confidence went away when I started looking at how vast and sprawling the API had become. Even though I had written dozens of modules and even plugins (views, fields, conditional) in D8, I still felt as though I had only scratched the surface.
I write this to say that you shouldn't be overwhelmed by this vastness. Just like in D7, with each project you will learn new concepts and how and when to apply them. In the case of the scanner module port, I learned quite a few new things which I will use in the future. I learned how to write a brand new plugin type and all its parts (plugin manager, plugin interface, annotation, and the plugins themselves), I learned how to use the Batch API, and I learned about the Tempstore service, which should be used instead of the $_SESSION superglobal variable to persist data across requests. To learn each of these new concepts I read some articles and watched YouTube videos. Initially I was afraid of how difficult it would be to write a brand new plugin, but after reading a few articles and the documentation on drupal.org, my fears were allayed. It's possible that my use case was fairly simple and that's why writing the plugin felt as easy as it did; however, I don't think my plugin is less complicated than the average one.
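For readers who haven't seen how the four plugin parts fit together, here is a rough Python analogy. It is not Drupal code and does not show the module's real classes: every name in it is hypothetical, and a class decorator stands in for the annotation that tells the plugin manager a plugin exists.

```python
from abc import ABC, abstractmethod


class ScannerPluginInterface(ABC):
    """The contract every plugin must fulfil (hypothetical name)."""

    @abstractmethod
    def search(self, records, term):
        """Return the ids of the records that contain term."""


class ScannerPluginManager:
    """Keeps track of plugin definitions and creates instances on demand."""

    def __init__(self):
        self._definitions = {}

    def register(self, plugin_id):
        """Class decorator standing in for an annotation with a plugin id."""
        def decorator(cls):
            if not issubclass(cls, ScannerPluginInterface):
                raise TypeError(f"{cls.__name__} must implement ScannerPluginInterface")
            self._definitions[plugin_id] = cls
            return cls
        return decorator

    def get_definitions(self):
        return dict(self._definitions)

    def create_instance(self, plugin_id):
        return self._definitions[plugin_id]()


manager = ScannerPluginManager()


# One plugin per kind of content; each knows which field to look in.
@manager.register("node")
class NodeScanner(ScannerPluginInterface):
    def search(self, records, term):
        return [r["id"] for r in records if term in r["body"]]


@manager.register("taxonomy_term")
class TermScanner(ScannerPluginInterface):
    def search(self, records, term):
        return [r["id"] for r in records if term in r["name"]]
```

The calling code only ever talks to the manager and the interface, so supporting another kind of content means adding one more small class rather than touching the search logic itself.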
3. Ask for feedback from your peers
Once I got the find and replace functionality working consistently, I did a demo and code walkthrough with one of my colleagues. Getting a second opinion almost always helps you improve your code, or at the very least deepens your understanding of it as you explain it to another person. After I finished going through the code, my colleague asked several questions about why I did things the way I did. We went back and forth on these points, and on a few of them I eventually agreed that his criticisms were valid, from both a logical and a coding standards point of view. Based on that feedback I changed my code. For example, the person who had done the rough port had written a form and then injected that form into the controller. Since they had done it that way, they had to put the form state values into the tempstore in order to access them in the controller and use them in the search and replace operations. My colleague suggested that the controller was superfluous and that I should simply put all the logic in the form itself. This way I can read the form state values directly in the submit handler and no longer need to persist them between the form and the controller.
I did a second demo with a group of my peers at our local Drupal meetup later that week, and they too gave me some ideas for improvements. The site I had been demoing the module on only had a small number of nodes (around 100). One of the attendees brought up the idea of batching the operations so that on large sites the request would not time out (or would at least be less likely to do so). The Batch API is something that many modules in Drupal already use (config import/export, devel generate, tmgmt, etc.), but I had never written any actual batch code myself. Before I attempted writing the batch functionality I tried running my code in the dev environment of the actual site where the module would be used. The site has about 5000 nodes and 2000 paragraphs with half a dozen fields on each. When I ran the code with just a few fields selected, I hit the PHP execution time limit by the fourth or fifth field. After adding the batch code this issue went away.
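The general idea behind batching can be sketched in a few lines of Python. This is not Drupal's Batch API and not the module's code, only an illustration of the principle: the work is cut into fixed-size chunks, and after each chunk the caller gets a progress report instead of waiting for one long-running request. The function name and the chunk size of 50 are assumptions chosen for the example.

```python
def run_in_batches(item_ids, process_item, chunk_size=50):
    """Process item_ids chunk by chunk, reporting progress after each chunk.

    Yields (done_so_far, total, results_for_this_chunk). In a web setting
    each yielded step would correspond to one short request.
    """
    total = len(item_ids)
    for start in range(0, total, chunk_size):
        chunk = item_ids[start:start + chunk_size]
        results = [process_item(item_id) for item_id in chunk]
        yield start + len(chunk), total, results


if __name__ == "__main__":
    # Hypothetical stand-in for "search and replace in one node".
    def fake_replace(node_id):
        return f"node {node_id} updated"

    for done, total, _ in run_in_batches(list(range(120)), fake_replace):
        print(f"{done}/{total} items processed")
```

Because no single step handles more than chunk_size items, the time spent per step stays roughly constant no matter how large the site grows, which is what keeps each request under the execution limit.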
Showing my code to others allowed me to improve the quality of my code and forced me to think in different ways.
4. Get in the habit of writing small chunks of code and testing often
Oftentimes when we want to write a new module or plugin, we get caught up in it and write the whole functionality before testing any of it. This is a bad habit which I have worked on breaking over the past few years, and I feel that doing so has greatly increased my productivity and efficiency. This principle may seem obvious, but it's not always what we do. If I had sat down and written the search, replace, and undo operations in one go without testing them, I would likely have encountered a bunch of bugs in my code. I would then have had to track down which part of a large code base was the culprit. If I instead write the search functionality, then the replacement function, and finally the undo operation, testing after each of those steps, debugging becomes much quicker and simpler. This idea could (and in my opinion should) be applied even further within each of those steps. Instead of waiting to finish the entire search feature, I can write and test each of the "search options" individually. With this approach, once I'm done writing and testing the case sensitivity code, for example, I can be confident in that piece when I combine it with any of the other options that I might choose. The same goes for the replacement operation: I can first write and test the node entity's replacement method before writing the paragraph or taxonomy term entity's replacement code.
At this point in software development, everyone should be using some kind of version control. In the same vein as writing and testing code in small increments, you should do the same with committing your code. I suggest that after you've tested a feature and are satisfied that it works, you make a commit containing those changes right away. Doing this has several benefits. You can revert to a previous state if your current changes break something or don't work. You have a commit history that you can come back to days, weeks, or months later to quickly remember what's been done. Finally, if you push to something like GitHub, you have a central location that protects your work in the event of data loss and allows others to see and use your code.
Further reading:
If you are interested in the details of how I went about transitioning the existing codebase, and in the actual code, you can read about it here.